CHAPTER 6 

MODEL FITTING 


6.1 Introduction 

As most animal carcinogenesis experiments aim at determining whether or not a 
particular treatment increases the risk of cancer at one or more sites, the statistical 
methods described so far lean heavily towards techniques for hypothesis testing. 
However, many long-term animal experiments are analysed to provide an appropriate 
condensation of the information contained in the data and not merely to answer a 
yes-no question. 

Many statistical models fitted to experimental data have fruitfully influenced the 
thinking on chemical carcinogenesis. The multistage model proposed by Armitage and 
Doll (1961) has been able to account for various experimental as well as epidemiological 
observations. Age-specific incidence of tumours induced by continuous exposure to 
chemical carcinogens has been described successfully by such models (Lee & O’Neill, 
1971; Berry & Wagner, 1969), which also predicted increasing incidence with age as 
being a consequence of prolonged time since first exposure. This prediction was 
confirmed experimentally (Peto et al., 1975). 

On the epidemiological side, a marked regularity of age-incidence curves for most 
spontaneous human tumours of epithelial origin has been noted (Cook et al., 1969) and 
detailed dose-response curves, as observed for the dependence of lung cancer risk on 
daily cigarette consumption, show a high degree of consistency with the multistage 
theory (Doll, 1971; Peto, 1977; Doll & Peto, 1978). Epidemiological data on the joint 
effect of two exposures are also interpretable in terms of the multistage model 
(Wahrendorf, 1984). Day and Brown (1980) used the multistage model to examine 
changes in risk after cessation of exposure. They showed that, for both 
experimental and epidemiological data, two different types of behaviour can arise 
under the multistage model. 

An interesting empirical observation for chemical carcinogenesis data was made by 
Druckrey et al. (1967). They noted that if d is the daily dose rate and t the median time 
to tumour induction (death with tumour), then the relationship $d \cdot t^n = \text{constant}$ holds 
for many carcinogens, especially nitroso compounds. This formula is predicted by the 
Weibull model (Carlborg, 1981) and has received wide attention in the discussion of 
thresholds in chemical carcinogenesis (Chand & Hoel, 1974; Port et al., 1976). 
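
One way to see why a Weibull-type model reproduces a relationship of this kind is sketched below. The specific form P(t; d) = 1 - exp(-a d^m t^k) and the parameter values are assumptions made for illustration and are not taken from the studies cited. Setting P = 1/2 gives a d^m t^k = ln 2, so that d t^n is constant with n = k/m.

```python
import math

def median_time(dose, a=1e-7, m=1.0, k=2.3):
    """Median time to tumour under the assumed model
    P(t; d) = 1 - exp(-a * d**m * t**k): solve a * d**m * t**k = ln 2."""
    return (math.log(2.0) / (a * dose ** m)) ** (1.0 / k)

def druckrey_product(dose, a=1e-7, m=1.0, k=2.3):
    """d * t**n with n = k/m; this does not depend on dose under the assumed model."""
    n = k / m
    return dose * median_time(dose, a, m, k) ** n

# Hypothetical dose rates: the median time shortens, the product stays the same.
for d in (0.5, 1.0, 2.0, 4.0):
    print(d, round(median_time(d), 1), druckrey_product(d))
```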

The fitting of statistical models has to compromise between specificity and 
identifiability. On the one hand, one would imagine that inserting all relevant 
knowledge about the carcinogenic process into a mathematical model would result 
in many parameters which could not all be identified by the limited amount of 
information provided from a long-term animal experiment. On the other hand, models 
which relate the probability of tumour development throughout life to the dose 
administered appear very simple. 

The degree of specificity which a model might be allowed to have depends on the 
details of the experimental design, including whether and when animals are inspected 
for diagnostic purposes and whether and when animals are killed — either by scheduled 
sacrifice or by normally occurring deaths. The experimental design also specifies the 
schedule of the dose application. Of special interest in this respect are chronic 
exposures stopped at a certain time or the application of fractionated doses which can 
differ in respect to total dose, number of fractions and time between single doses. 

One essential problem with the fitting of statistical models should be stated clearly. 
These models are usually fitted to data sets from single experiments done in one strain, 
in one sex, by one route of application and by considering tumours at one site. 
Consequently, the scope of generalization of one such fit is limited, particularly in the 
absence of consistency in studies done under different conditions. 

In the following sections, we will give an overview of some statistical models which 
have been proposed for the analysis of long-term animal experiments. This will include 
simple dose-response models, time-to-tumour models and models based on different 
states in the course of the tumour development. 

6.2 Dose-response models 

As noted in Chapter 3, the probability of tumour occurrence depends on both the 
dose and the period of exposure. In many cases, however, the dose-response 
relationship at a fixed point in time may be of interest. In the absence of decreased 
survival at high doses, for example, the proportion of animals developing tumours 
during the course of a long-term study generally increases with dose. 

The shape of such dose-response curves can vary widely depending on the agent used 
(Fig. 6.1). While the dose-response curve for liver tumours induced in mice as a 
result of exposure to 2-acetylaminofluorene (2-AAF) for 24 months is nearly linear 
(Littlefield et al., 1980a), other curves may be distinctly nonlinear. The dose-response 
curve for liver tumours induced as a result of exposure to gaseous vinyl chloride (six 
hours per day for two years) increases somewhat linearly at low doses and then tends 
to level off at higher doses (Maltoni, 1975). This plateau effect is thought to be due 
to saturation of the metabolic activation mechanism for vinyl chloride in the liver 
(Gehring et al., 1978). Conversely, the dose-response curve for squamous-cell 
carcinomas of the nasal passage induced as a result of exposure to gaseous 
formaldehyde (also six hours per day for two years) shows a marked increase in 
response above 5.6 ppm (Swenberg et al., 1983), possibly due to saturation of the 
mucociliary clearance mechanism. Finally, the dose-response curve for liver tumours 
resulting from ingestion of aflatoxin in the diet (Wogan et al., 1974) increases at low 
doses but levels off at high doses as a response rate approaching 100% is reached. 

Fig. 6.1 Examples of dose-response relationships for vinyl chloride (from Maltoni, 1975), 
aflatoxin (from Wogan et al., 1974), 2-acetylaminofluorene (from Littlefield et al., 
1980a) and formaldehyde (from Swenberg et al., 1983) 

These examples clearly illustrate that the dose-response curves for different chemical 
agents can be quite dissimilar. This kind of variation in shape is also encountered in 
radiation carcinogenesis (Ullrich et al., 1976; Ullrich & Storer, 1979a,b; Ullrich, 1980). 
In the remainder of this section, we consider the problem of modelling the 
dose-response relationship for a given compound in order to obtain a more quantitative 
description of the data. We shall consider also the use of such models in estimating the 
response rate at doses not included in the experimental protocol. 


Some mathematical models 

The relationship between the crude proportion of animals developing tumours 
during the course of a bioassay and the level of exposure may be described by means of 
a statistical model relating the probability of tumour induction P(d) and the dose d. 
Statistical or tolerance distribution models are based on the concept that each animal 
has its own tolerance to the test compound and will develop a lesion only if that 
tolerance is exceeded. The tolerances are presumed to vary within the population 
according to some tolerance distribution, G(t), so that the probability that an animal 
selected at random will respond to dose d is given by 

$$P(d) = \Pr\{\text{tolerance} \le d\} = G(d). \tag{6.1}$$

A general class of tolerance distribution models is defined by 

$$G(t) = F(\alpha + \beta \log t), \tag{6.2}$$

where F denotes some suitable cumulative distribution function and $\alpha$ and $\beta > 0$ are 
parameters (Chand & Hoel, 1974). Three commonly encountered models in this class 
are the probit, logit and extreme value, defined by 

$$F(x) = (2\pi)^{-1/2} \int_{-\infty}^{x} \exp(-u^2/2)\, du, \tag{6.3}$$

$$F(x) = [1 + \exp(-x)]^{-1}, \tag{6.4}$$

$$F(x) = 1 - \exp\{-\exp(x)\} \tag{6.5}$$

respectively. Since under the extreme value model $G(t) = 1 - \exp(-a t^b)$, where 
$a = \exp(\alpha)$ and $b = \beta$, this model is sometimes called the Weibull model (see Section 
6.3). 
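
The three distribution functions (6.3) to (6.5) and the tolerance model (6.2) can be written down directly. The sketch below is illustrative only: the function names and parameter values are hypothetical, and the natural logarithm of dose is assumed. It also checks numerically that the extreme value model coincides with the Weibull form.

```python
import math

def probit_cdf(x):
    """Standard normal distribution function, equation (6.3)."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def logit_cdf(x):
    """Logistic distribution function, equation (6.4)."""
    return 1.0 / (1.0 + math.exp(-x))

def extreme_value_cdf(x):
    """Extreme value distribution function, equation (6.5)."""
    return 1.0 - math.exp(-math.exp(x))

def tolerance_model(cdf, alpha, beta, dose):
    """P(d) = F(alpha + beta * ln d), taking P(0) = 0 as the limit d -> 0."""
    if dose <= 0.0:
        return 0.0
    return cdf(alpha + beta * math.log(dose))

def weibull(a, b, dose):
    """Weibull form 1 - exp(-a * d**b)."""
    return 1.0 - math.exp(-a * dose ** b)

alpha, beta = -2.0, 1.5  # hypothetical parameter values
for d in (0.5, 1.0, 2.0):
    print(d,
          tolerance_model(extreme_value_cdf, alpha, beta, d),
          weibull(math.exp(alpha), beta, d))
```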

Stochastic or mechanistic models are based on the concept that a toxic response is 
the result of the random occurrence of one or more fundamental biological events. 
Under the multi-hit model, for example, a response is assumed to be induced once the 
target tissue has been 'hit' by $k \ge 1$ biologically effective units of dose within a specified 
time period. Assuming that the number of hits during this period follows a Poisson 
process, the probability of a response is given by 

$$P(d) = \Pr\{\text{at least } k \text{ hits}\} = \sum_{j=k}^{\infty} \exp(-\lambda d)\, \frac{(\lambda d)^j}{j!}, \tag{6.6}$$

where $\lambda d > 0$ denotes the expected number of hits during this period (Rai & Van 
Ryzin, 1981). When k = 1, the multi-hit model reduces to the one-hit model given by 

$$P(d) = 1 - e^{-\lambda d}. \tag{6.7}$$
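
A minimal sketch of the multi-hit probability follows. It evaluates (6.6) through the complementary finite sum, 1 minus the Poisson probability of fewer than k hits, which is numerically equivalent. The function names and the parameter values are hypothetical.

```python
import math

def multi_hit(dose, lam, k):
    """Pr{at least k hits} when the number of hits is Poisson with mean lam * dose."""
    mean = lam * dose
    fewer_than_k = sum(math.exp(-mean) * mean ** j / math.factorial(j) for j in range(k))
    return 1.0 - fewer_than_k

def one_hit(dose, lam):
    """One-hit model, equation (6.7)."""
    return 1.0 - math.exp(-lam * dose)

# With k = 1 the two functions agree; larger k lowers the response at a given dose.
for d in (0.5, 2.0, 8.0):
    print(d, one_hit(d, 0.3), multi_hit(d, 0.3, 1), multi_hit(d, 0.3, 2))
```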

Incorporation of background response 

All of the above models imply that the background response rate P(0) is zero. In 
many cases, however, the response of interest will also occur spontaneously in control 
animals (Tarone et al., 1981). Spontaneously occurring lesions may be assumed to arise 
as a result of a variety of biological mechanisms. Two commonly encountered 
assumptions in this regard are independence and additivity (Hoel, 1980). In the first 
case, spontaneous and induced lesions are presumed to occur independently of each 
other so that the probability of either a spontaneous or a treatment-induced response 
occurring is given by 

$$P^*(d) = \pi_0 + (1 - \pi_0) P(d), \tag{6.8}$$

where $0 < \pi_0 < 1$ denotes the background rate of response (Abbott, 1925). Under 
additivity, spontaneously occurring effects are considered to be due to an effective 
background dose $\delta > 0$, with 

$$P^*(d) = P(d + \delta). \tag{6.9}$$

Note that, with the one-hit model, the independence and additivity assumptions are 
indistinguishable. A combination of both independent and additive background may be 
represented by the model 

$$P^*(d) = \pi_0 + (1 - \pi_0) P(d + \delta). \tag{6.10}$$
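
The equivalence of the two background assumptions under the one-hit model can be checked numerically. In the sketch below the parameter values are hypothetical. The background rate is set to the one-hit response at the background dose, since 1 - exp{-lambda (d + delta)} = pi_0 + (1 - pi_0)[1 - exp(-lambda d)] when pi_0 = 1 - exp(-lambda delta).

```python
import math

def one_hit(dose, lam):
    """One-hit model, equation (6.7)."""
    return 1.0 - math.exp(-lam * dose)

def independent_background(p_induced, pi0):
    """Abbott's formula, equation (6.8)."""
    return pi0 + (1.0 - pi0) * p_induced

def additive_background(model, dose, delta):
    """Equation (6.9): P*(d) = P(d + delta)."""
    return model(dose + delta)

lam, delta = 0.2, 0.5       # hypothetical values
pi0 = one_hit(delta, lam)   # background rate implied by the background dose
for d in (0.0, 1.0, 5.0):
    indep = independent_background(one_hit(d, lam), pi0)
    addit = additive_background(lambda x: one_hit(x, lam), d, delta)
    print(d, indep, addit)
```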




Other models 

A simple class of tolerance distribution models in which background response arises 
in neither an independent nor additive fashion is defined by 

$$P(d) = F(\alpha + \beta d), \tag{6.11}$$

with $\beta > 0$ as in (6.2). When F follows the logistic distribution in (6.4), Cox (1970) has 
shown that the uniformly most powerful, unbiased test for positive slope in the 
proportion of animals responding with increasing dose is the Cochran-Armitage test 
discussed in Chapter 5. Subsequently, Tarone and Gart (1980) demonstrated the 
robustness of this test by showing that it is the locally most powerful for any monotone 
increasing distribution F. However, because the model given by (6.11) involves only 
two parameters, it is less flexible than those given by (6.8), and may not always provide 
an adequate description of the observed dose-response curve. 


Armitage-Doll multistage model 

Perhaps the most widely applied model in the case of carcinogenesis is the 
Armitage-Doll multistage model (Armitage, 1982). In this case, it is assumed that a 
cell line progresses through k distinct stages prior to becoming cancerous and that the 
rate of occurrence of the ith change is of the form $\lambda_i = \alpha_i + \beta_i d$, where $\alpha_i > 0$ and 
$\beta_i \ge 0$ for $i = 1, \ldots, k$. The parameter $\alpha_i$ represents the spontaneous rate of 
occurrence of the ith change in the absence of any exposure, and the rate is supposed 
to be linearly dependent on dose through $\beta_i d$. The probability of a response within a 
given time period is then approximately 

$$P^*(d) = 1 - \exp\Big\{-c \prod_{i=1}^{k} (\alpha_i + \beta_i d)\Big\}, \tag{6.12}$$

where c > 0 (Crump et al., 1976). Noting that the exponent in (6.12) is a polynomial in 
dose, this model can also be viewed as 

$$P^*(d) = 1 - \exp\Big\{-\sum_{j=0}^{k} b_j d^j\Big\}, \tag{6.13}$$

where the $b_j$ are subject to certain nonlinear constraints (Krewski & Van Ryzin, 1981). 
For simplicity, however, the linear constraints $b_j \ge 0$ are often employed in practice, 
providing a more general model than given by (6.12) (Crump et al., 1977). 
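
The passage from the product form (6.12) to the polynomial form (6.13) amounts to multiplying out the stage-specific rates. The following sketch does this for hypothetical values of c and of the stage parameters, and confirms that both forms give the same response probability.

```python
import math

def multistage_coefficients(c, alphas, betas):
    """Expand c * prod_i (alpha_i + beta_i * d) into coefficients b_0, ..., b_k of d**j."""
    coeffs = [c]
    for a, b in zip(alphas, betas):
        expanded = [0.0] * (len(coeffs) + 1)
        for j, cj in enumerate(coeffs):
            expanded[j] += cj * a        # contribution to d**j
            expanded[j + 1] += cj * b    # contribution to d**(j + 1)
        coeffs = expanded
    return coeffs

def product_form(c, alphas, betas, dose):
    """Response probability from (6.12)."""
    exponent = c
    for a, b in zip(alphas, betas):
        exponent *= a + b * dose
    return 1.0 - math.exp(-exponent)

def polynomial_form(coeffs, dose):
    """Response probability from (6.13)."""
    return 1.0 - math.exp(-sum(b * dose ** j for j, b in enumerate(coeffs)))

# Hypothetical two-stage example in which only the first stage is dose-dependent.
c, alphas, betas = 1.0, [0.1, 0.2], [0.5, 0.0]
b = multistage_coefficients(c, alphas, betas)
print(b, product_form(c, alphas, betas, 3.0), polynomial_form(b, 3.0))
```

Because the stage parameters are non-negative, every coefficient produced in this way is non-negative, which is the sense in which the constraints $b_j \ge 0$ define a larger family than (6.12).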

Pharmacokinetic model 

In many cases, a chemical will require some form of metabolic activation before it 
may exert its toxic effects (Cornfield, 1977). Rai and Van Ryzin (1983), for example, 
consider the simple compartmental model shown in Figure 6.2. Here, the administered 
dose $D_1(t)$ at time t undergoes a transformation $T_1$ to the activated form $D_2(t)$ and may 
then be eliminated via a second transformation $T_2$. Each transformation $T_i$ is assumed 
to follow saturable Michaelis-Menten kinetics (Karlson, 1965, p. 80), with 

$$\text{Rate}(T_i) = \frac{b_i D_i(t)}{c_i + D_i(t)}, \tag{6.14}$$

where $b_i, c_i > 0$ (i = 1, 2). Assuming that the dose is administered at a constant rate k, 
the system satisfies the nonlinear differential equations 

$$\frac{dD_1(t)}{dt} = k - \frac{b_1 D_1(t)}{c_1 + D_1(t)} \tag{6.15}$$

and 

$$\frac{dD_2(t)}{dt} = \frac{b_1 D_1(t)}{c_1 + D_1(t)} - \frac{b_2 D_2(t)}{c_2 + D_2(t)}. \tag{6.16}$$

Under the steady state conditions $dD_i(t)/dt = 0$, it follows from (6.16) that 

$$D_2^* = \frac{a_1 d}{1 + a_2 d}, \tag{6.17}$$

where $d = D_1(t)$ is constant, $a_1 = b_1 c_2/(b_2 c_1) > 0$ and $a_2 = (b_2 - b_1)/(b_2 c_1) > -1/M$, with M 
being the highest dose $D_1(t)$ such that the rate of $T_1$ does not exceed the rate of $T_2$. If 
both transformations follow linear kinetics, then $a_2 = 0$ in (6.17) and the effective dose 
$D_2^*$ is directly proportional to the administered dose d. 

Fig. 6.2 A simple pharmacokinetic model for metabolic fate of a compound 
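
The steady-state expression (6.17) can be verified by checking that it balances the formation and elimination rates of the activated form in (6.16). The rate constants below are hypothetical and serve only to illustrate the algebra.

```python
def michaelis_menten_rate(b, c, amount):
    """Rate of a saturable transformation, equation (6.14)."""
    return b * amount / (c + amount)

def steady_state_effective_dose(d, b1, c1, b2, c2):
    """Effective dose D2* from (6.17) for a constant level d of D1."""
    a1 = b1 * c2 / (b2 * c1)
    a2 = (b2 - b1) / (b2 * c1)
    return a1 * d / (1.0 + a2 * d)

# Hypothetical rate constants, chosen only for illustration.
b1, c1, b2, c2 = 1.0, 2.0, 3.0, 4.0
for d in (0.5, 1.5, 5.0):
    d2 = steady_state_effective_dose(d, b1, c1, b2, c2)
    inflow = michaelis_menten_rate(b1, c1, d)     # formation of D2 via T1
    outflow = michaelis_menten_rate(b2, c2, d2)   # elimination of D2 via T2
    print(d, d2, inflow, outflow)
```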

The probability of a response is assumed to depend only on the steady-state level of 
the effective dose $D_2^*$ in (6.17), say 

$$P(d) = F[D_2^*(d)]. \tag{6.18}$$

Taking F to be of the one-hit form with an additive constant 

$$F(x) = 1 - \exp\{-(\alpha + \beta x)\} \tag{6.19}$$

yields the dose-response model 

$$P(d) = 1 - \exp\Big\{-\Big(\theta_1 + \frac{\theta_2 d}{1 + \theta_3 d}\Big)\Big\}, \tag{6.20}$$

where $\theta_1 = \alpha > 0$, $\theta_2 = \beta a_1 > 0$ and $\theta_3 = a_2 > -1/M$. This model can, depending on the 
rate coefficients governing $T_1$ and $T_2$, describe both downward bending curves, such as 
that noted for vinyl chloride (saturable activation), as well as 'hockey-stick' shaped 
curves, such as that for formaldehyde (saturable elimination). Thus, even though the 
dose-response curve follows a simple one-hit model in terms of the effective dose $D_2^*$, a 
variety of curves may still arise as a result of the saturability of the activation and 
elimination steps. 
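
How the sign of the third parameter governs the shape of the curve can be seen from a small numerical sketch. The parameter values are hypothetical. A positive value makes the effective dose bend downward with increasing dose, and a negative value (admissible only for doses below the bound M) makes it bend upward.

```python
import math

def effective_dose(dose, theta3):
    """Shape of the steady-state effective dose in (6.17), up to the factor a1."""
    return dose / (1.0 + theta3 * dose)

def pk_one_hit(dose, theta1, theta2, theta3):
    """Dose-response model (6.20)."""
    return 1.0 - math.exp(-(theta1 + theta2 * effective_dose(dose, theta3)))

def second_difference(f, x, h=0.1):
    """Discrete curvature: negative for downward bending, positive for upward bending."""
    return f(x + h) - 2.0 * f(x) + f(x - h)

# Hypothetical parameter values; theta3 = -0.1 is admissible for doses below 10.
for theta3 in (0.5, 0.0, -0.1):
    curve = [round(pk_one_hit(d, 0.05, 0.2, theta3), 3) for d in (0.0, 1.0, 2.0, 4.0)]
    print(theta3, curve, second_difference(lambda d: effective_dose(d, theta3), 2.0))
```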

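More generally, Rai and Van Ryzin (1983) also consider F to be of the form 

$$F(x) = 1 - \exp\{-(\alpha + \beta x^{\gamma})\}, \tag{6.21}$$

with $\gamma > 0$. In this case, the overall dose-response model in (6.20) becomes 

$$P(d) = 1 - \exp\Big\{-\Big[\theta_1 + \theta_2 \Big(\frac{d}{1 + \theta_3 d}\Big)^{\theta_4}\Big]\Big\}, \tag{6.22}$$

where $\theta_4 = \gamma$ and $\theta_2$ now equals $\beta a_1^{\gamma}$. The use of the additional parameter $\theta_4$ allows for additional curvature in 
the model P(d) which cannot be accommodated by the pharmacokinetic parameters $\theta_2$ 
and $\theta_3$. 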

Gehring and Blau (1977) considered the somewhat more complex model shown in 
Figure 6.3. Once taken up by the body, a chemical C may be either eliminated 
immediately or activated to form a reactive metabolite RM. This in turn may be 
detoxified or react with cellular macromolecules to form covalently-bound genetic 
material (CBG). In this model, it is also possible that the reactive metabolite may be 
neutralized by nongenetic covalent binding (CBN). The covalently-bound genetic 
material may then be repaired (CBGR) or replicated (RCBG) resulting in the 
development of a genetic lesion. 


Fig. 6.3 A more complex pharmacokinetic model for metabolic fate of a compound (from 
Gehring & Blau, 1977). C, chemical; RM, reactive metabolite; Ce, excreted chemical; 
IM, inactive metabolite; CBN, covalent binding, nongenetic; CBG, covalent binding, 
genetic; CBGR, repaired covalently bound genetic material; RCBG, retained genetic 
programme, critical and noncritical 

This model was subsequently examined in detail by Hoel et al. (1983). They assumed 
that all reactions are governed by linear first-order kinetics except for activation, 
detoxification and repair, which are allowed to be saturable in accordance with 
Michaelis-Menten kinetics. Since replication follows linear kinetics, the amount of 
damage is proportional to [CBG], the concentration of CBG. Under this model, 
[CBG] provides a measure of the effective dose. 

Assuming that the dosing regimen is such that the concentration of a chemical [C] is 
proportional to the administered dose, one can use the two nonlinear differential 
equations describing the system first to solve for the concentration of a reactive 
metabolite [RM] in terms of [C], in steady state, and then to express [CBG] in terms of 
[RM]. 

Considering F to be of the one-hit form (6.18), Hoel et al. (1983) (see also Anderson 
et al., 1980) noted that, under this model, the overall dose-response curve could 
assume any one of the four shapes shown in Figure 6.4, depending on which of the 
activation steps are saturable. With all processes being essentially linear, a linear 
dose-response curve results. If only repair or detoxification is saturable, the dose- 
response curve will be ‘hockey-stick’ shaped. If only activation is saturable, the shape is 
similar in form, although inverted. If more than one process is saturable, a 
combination of these shapes will occur. 

Under this more complex model, the explicit expression for the overall dose-response 
model P(d), as in (6.20), is very complicated. For the purposes of describing 
actual data, however, the model in (6.22) may be fitted using standard maximum 
likelihood procedures, as described below. 

Fig. 6.4 Relationship between delivered and administered dose and different pharmacokinetic 
conditions (from Hoel et al., 1983) 

Maximum likelihood estimation 

Whatever the specific parametric form of a dose-response model, the probability of 
observing a response at dose level d depends on some unknown parameters. In 
general, there may be p such parameters $\theta_1, \ldots, \theta_p$, summarized as a vector 
$\theta = (\theta_1, \ldots, \theta_p)'$, the probability of response being then denoted as $P^*(d; \theta)$. 
We now outline how these unknown parameters can be estimated from 
the observed data by maximum likelihood methods. Suppose that a total of n animals 
are used in an experiment involving I + 1 dose levels $0 = d_0 < d_1 < \cdots < d_I$ and that $x_i$ 
of the $n_i$ animals at dose $d_i$ (i = 0, 1, ..., I) develop tumours in the course of the 
study. Assuming that each animal responds independently of all other animals in the 
experiment, we find that the likelihood of the observed outcome under any dose-response 
model $P^*(d; \theta)$ is given by 

$$L(\theta) = \prod_{i=0}^{I} \binom{n_i}{x_i} (P_i^*)^{x_i} (1 - P_i^*)^{n_i - x_i}, \tag{6.23}$$

where $P_i^* = P^*(d_i; \theta)$. Those values $\hat\theta$ of the parameters $\theta$ which maximize $L(\theta)$ 
are called 'maximum likelihood estimates'. Since maximization of $L(\theta)$ using direct 
analytical procedures is generally not possible, the maximum likelihood estimator $\hat\theta$ of 
$\theta$ is usually obtained using iterative numerical procedures. Under mild regularity 
conditions, it can be shown theoretically that $\hat\theta$ is a consistent estimate for $\theta$ as $n \to \infty$ 
(Krewski & Van Ryzin, 1981). Under these same conditions, $\sqrt{n}(\hat\theta - \theta)$ is approximately 
normally distributed with mean 0 and variance/covariance matrix $V = [(v_{rs})]$. The 
elements of the inverse of this covariance matrix $V^{-1} = [(v^{rs})]$ represent the Fisher 
information and are derived through the second derivatives of the logarithm of the 
likelihood function $L(\theta)$ in (6.23): 

$$v^{rs} = \sum_{i=0}^{I} c_i\, \frac{\partial P_i^*}{\partial \theta_r} \frac{\partial P_i^*}{\partial \theta_s} \Big/ (P_i^* Q_i^*), \tag{6.24}$$

where $c_i = \lim_{n \to \infty} n_i/n > 0$ and $Q_i^* = 1 - P_i^*$. 

These theoretical results provide the basis for computing the estimated dose-response 
model $\hat P(d) = P^*(d; \hat\theta)$ and, using the estimated covariance matrix of the 
parameter estimates, for calculating corresponding confidence intervals. For this purpose, 
the matrix $[(v^{rs})]$ is computed at $\theta = \hat\theta$, the actual values of the maximum likelihood 
estimates, and using $c_i = n_i/n$. The usual chi-square statistic 

$$X^2 = \sum_{i=0}^{I} (x_i - n_i \hat P_i^*)^2 / (n_i \hat P_i^* \hat Q_i^*) \tag{6.25}$$

may be used to assess the goodness-of-fit of $\hat P^*(d)$. Provided that the assumed model 
$P^*(d)$ is correct, the asymptotic distribution of this statistic is chi-square with 
$(I + 1) - p > 0$ degrees of freedom. 

Estimation procedures for the multistage model (6.13) are complicated by the 
non-negativity constraints on the parameters $b_j$. Because of these constraints, the 
asymptotic distribution of the maximum likelihood estimators will generally not be 
normal (Guess & Crump, 1978). Similarly, the usual chi-square statistic (6.25) will 
generally be inapplicable. Nonetheless, efficient algorithms for obtaining the restricted 
maximum likelihood estimators have been developed by Crump (1984) and Hartley 
and Sielken (1978). 
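
As a rough illustration of the unconstrained iterative procedure and of the information matrix in (6.24), the sketch below applies Fisher scoring to the dieldrin data given in the example that follows. The two-parameter model P*(d) = 1 - exp{-(theta_1 + theta_2 d)}, the starting values and the fixed number of iterations are assumptions made for this sketch. It is not the model fitted in the text, and it has no safeguards against steps that leave the parameter space.

```python
import math

# Dieldrin data from the example in this section (Walker et al., 1973)
doses = [0.0, 1.25, 2.50, 5.00]
tumours = [17, 11, 25, 44]
animals = [156, 60, 58, 60]

def prob(theta, d):
    """Assumed model for this sketch: P*(d) = 1 - exp(-(theta1 + theta2 * d))."""
    return 1.0 - math.exp(-(theta[0] + theta[1] * d))

def score_and_information(theta):
    """Score vector and total Fisher information (n times the matrix in (6.24))."""
    u = [0.0, 0.0]
    info = [[0.0, 0.0], [0.0, 0.0]]
    for d, x, n in zip(doses, tumours, animals):
        p = prob(theta, d)
        q = 1.0 - p
        grad = [q, d * q]  # dP*/dtheta1 and dP*/dtheta2
        for r in range(2):
            u[r] += (x - n * p) * grad[r] / (p * q)
            for s in range(2):
                info[r][s] += n * grad[r] * grad[s] / (p * q)
    return u, info

def invert_2x2(m):
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det], [-m[1][0] / det, m[0][0] / det]]

def fisher_scoring(theta, iterations=100):
    """Repeat theta <- theta + information^{-1} * score."""
    for _ in range(iterations):
        u, info = score_and_information(theta)
        cov = invert_2x2(info)
        theta = [theta[r] + sum(cov[r][s] * u[s] for s in range(2)) for r in range(2)]
    return theta

theta_hat = fisher_scoring([0.1, 0.2])
_, info_hat = score_and_information(theta_hat)
cov_hat = invert_2x2(info_hat)  # approximate covariance matrix of the estimates
std_errors = [math.sqrt(cov_hat[r][r]) for r in range(2)]
print(theta_hat, std_errors)
```

Inverting the total information at the estimates gives the approximate covariance matrix of the estimates themselves, that is V/n in the notation above.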

As an example, consider the following data on liver tumours induced in mice in a 
lifetime study of feeding dieldrin (Walker et al., 1973). 


Dose (ppm):            0        1.25     2.50     5.00 
Response (x_i/n_i):    17/156   11/60    25/58    44/60 


These data, along with the fitted extreme value model, assuming independent 
background, are shown in Figure 6.5. The maximum likelihood estimates of the 
parameters (± standard error) are $\hat\alpha = -2.46 \pm 0.50$, $\hat\beta = 1.66 \pm 0.35$ and $\hat\pi_0 = 0.106 \pm 
0.024$. As may be expected with most monotonically increasing data sets, the model fits 
the data reasonably well, with no evidence of lack-of-fit provided by the chi-square 
statistic (p-value > 0.4). Similar results may be obtained with other data sets (Fig. 
6.6 and Table 6.1) and with the other models discussed above. The fitted dose-response 
curves assuming additive background will also be similar, although in this case the 
likelihood surface is generally quite flat, and the parameters are thus less well 
determined. 
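
The reported fit can be examined with a few lines of code. The sketch below evaluates the extreme value model with independent background at the rounded estimates quoted above and computes the chi-square statistic (6.25). It assumes that the natural logarithm of dose is used in (6.2). With these rounded values the statistic should come out at roughly 0.6 on one degree of freedom, in line with the p-value quoted above.

```python
import math

# Dieldrin data from the text (Walker et al., 1973)
doses = [0.0, 1.25, 2.50, 5.00]
tumours = [17, 11, 25, 44]
animals = [156, 60, 58, 60]

# Maximum likelihood estimates as reported in the text (rounded)
ALPHA, BETA, PI0 = -2.46, 1.66, 0.106

def fitted_probability(d):
    """Extreme value model with independent background:
    P*(d) = pi0 + (1 - pi0) * [1 - exp(-exp(alpha + beta * ln d))]."""
    induced = 0.0 if d <= 0.0 else 1.0 - math.exp(-math.exp(ALPHA + BETA * math.log(d)))
    return PI0 + (1.0 - PI0) * induced

def pearson_chi_square():
    """Statistic (6.25) evaluated at the reported estimates."""
    total = 0.0
    for d, x, n in zip(doses, tumours, animals):
        p = fitted_probability(d)
        total += (x - n * p) ** 2 / (n * p * (1.0 - p))
    return total

x2 = pearson_chi_square()
df = len(doses) - 3                       # (I + 1) - p with p = 3 parameters
p_value = math.erfc(math.sqrt(x2 / 2.0))  # chi-square upper tail, valid for df = 1 only

for d, x, n in zip(doses, tumours, animals):
    print(d, round(x / n, 3), round(fitted_probability(d), 3))
print(round(x2, 2), df, round(p_value, 2))
```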

Fig. 6.5 Dose-response curve for dieldrin-induced liver tumours in mice fitted under the 
extreme value model 

In some applications, interest centres on the added risk over background which, 
under the assumption of independence (6.8), would be 

$$\Pi(d) = \frac{P^*(d) - P^*(0)}{1 - P^*(0)}. \tag{6.26}$$

Fig. 6.6 Dose-response curves for eight compounds fitted under the extreme value model 

Table 6.1 Lesions induced by eight rodent carcinogens 

Compound                 Reference                    Species  Tumour       Duration   Dose units 
Hexachlorobenzene        Arnold et al. (1985)         …        …cytoma      …          ppm in diet 
Nitrilotriacetic acid    Food Safety Council (1978)   Rat      Kidney       …          … 
Ethylenethiourea         Graham et al. (1975)         Rat      Thyroid      2 years    ppm 
N-Nitrosodimethylamine   Terracini et al. (…)         Rat      Liver        120 weeks  ppm 
…                        Kuschner et al. (1975)       Rat      Respiratory  lifetime   no. of … 
DDT                      Tomatis et al. (1972)        Mouse    Liver        130 weeks  ppm 
Sodium saccharin         Taylor & Friedman (1974)     Rat      Bladder      2 years    … 
Photomirex               Chu et al. (1981)            Rat      Thyroid      21 months  … 

In the same way, one may wish to evaluate the dose level corresponding to a certain 
level of risk over background, q, say (0 < q < 1), which would be $d_q = \Pi^{-1}(q)$. The 
maximum likelihood estimator of $d_q$ is defined by $\hat d_q = \hat\Pi^{-1}(q)$, where $\hat\Pi(d) = [\hat P^*(d) - 
\hat P^*(0)]/[1 - \hat P^*(0)]$. Since $\sqrt{n}(\hat d_q - d_q)$ is asymptotically normally distributed with 
mean zero and variance 

$$\sigma^2 = \Big(\frac{\partial \Pi}{\partial d}\Big)^{-2} \sum_{r} \sum_{s} \frac{\partial \Pi}{\partial \theta_r} \frac{\partial \Pi}{\partial \theta_s}\, v_{rs}, \tag{6.27}$$

an approximate $100(1 - \alpha)\%$ confidence interval for $d_q$ is given by 

$$\hat d_q \pm z_{\alpha/2}\, \hat\sigma / \sqrt{n}, \tag{6.28}$$

where $z_{\alpha/2}$ denotes the $100(1 - \alpha/2)$ percentile of the standard normal distribution, and 
$\hat\sigma$ is an estimate of $\sigma$ obtained by replacing $\theta$ by $\hat\theta$ in (6.27). Other possible confidence 
limit procedures, including those based on the asymptotic distribution of the 
log-likelihood (Cox & Hinkley, 1974, p. 343), are reviewed by Crump and Howe 
(1985). 
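
Inverting the added-risk function is a one-dimensional root-finding problem. The sketch below uses bisection with the rounded dieldrin estimates reported earlier. The search interval and tolerance are arbitrary choices, and the closed-form inverse is specific to the extreme value form, for which the added risk under independent background equals the induced response probability. The confidence limits in (6.28) are not computed, since the covariance matrix of the estimates is not given in the text.

```python
import math

ALPHA, BETA = -2.46, 1.66  # estimates reported in the text for the dieldrin data

def added_risk(d):
    """Added risk over background for the extreme value model with independent
    background: 1 - exp(-exp(alpha + beta * ln d))."""
    if d <= 0.0:
        return 0.0
    return 1.0 - math.exp(-math.exp(ALPHA + BETA * math.log(d)))

def dose_for_risk(q, lo=0.0, hi=100.0, tol=1e-12):
    """Solve added_risk(d) = q by bisection; the added risk is increasing in d."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if added_risk(mid) < q:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def dose_for_risk_closed_form(q):
    """Direct inverse, available for this particular model."""
    return math.exp((math.log(-math.log(1.0 - q)) - ALPHA) / BETA)

for q in (0.10, 0.25, 0.50):
    print(q, dose_for_risk(q), dose_for_risk_closed_form(q))
```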

Two applications of doses associated with certain levels of risk over background are 
of particular interest. First, Mantel and Bryan (1961) proposed the use of some suitably 
low risks (for example, $q = 10^{-6}$) as a means of defining a 'virtually safe' level of 
exposure in the absence of a threshold in the dose-response curve. It is now widely 
recognized that estimation of such extreme risks is subject to considerable model 
uncertainty. Extrapolation of the data on 2-AAF-induced liver tumours shown in 
Figure 6.7 (Littlefield et al., 1980a), for example, using the probit, logit, extreme 
value, multihit and multistage models, yields estimates of virtually safe doses spanning 
several orders of magnitude. Because of this uncertainty, it has been proposed that 
some form of linear extrapolation be used to obtain a lower limit on such extreme 
doses. The assumption of low-dose linearity follows, in fact, immediately from the 
assumption of additive background, since in that case a simple Taylor expansion shows 
that 

$$\Pi(d) \approx f(0)\, d \tag{6.29}$$

for small d, where $f(d) = \partial \Pi(d)/\partial d$. One simple procedure which may be used for this 
purpose is to extrapolate linearly from some higher quantile such as $d_{0.01}$ (Van Ryzin, 
1980). (For the 2-AAF data, this form of linear extrapolation yields results close to 
those predicted by the multistage model.) A similar form of linear extrapolation has 
also been proposed by the World Health Organization (1984, p. 50). 

Fig. 6.7 Dose-response curve for 2-acetylaminofluorene (2-AAF)-induced liver tumours in mice 
fitted under the Weibull model and performance of six extrapolation procedures 
(from Krewski et al., 1984b). X, linear extrapolation; M, multistage model; W, Weibull 
model; L, logit model; G, gamma multi-hit model; P, probit model 
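
The difference between model-based and linear extrapolation can be illustrated with a short calculation. The sketch below reuses the rounded dieldrin estimates from the earlier example, not the 2-AAF data, purely for illustration. The anchor quantile of 0.01 and the target risk of one in a million follow the text. Because the fitted shape parameter exceeds 1, the fitted curve is sublinear at low doses and the linear procedure yields the smaller dose.

```python
import math

ALPHA, BETA = -2.46, 1.66  # dieldrin estimates reported earlier, reused for illustration

def quantile_dose(q):
    """d_q for the extreme value form: solve 1 - exp(-exp(alpha + beta * ln d)) = q."""
    return math.exp((math.log(-math.log1p(-q)) - ALPHA) / BETA)

def linear_extrapolation_dose(target_risk, anchor_risk=0.01):
    """Dose at target_risk on the straight line from the origin to (d_anchor, anchor_risk)."""
    return quantile_dose(anchor_risk) * target_risk / anchor_risk

target = 1e-6
model_dose = quantile_dose(target)
linear_dose = linear_extrapolation_dose(target)
print(model_dose, linear_dose, model_dose / linear_dose)
```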

The second application of the above concept, which can also be termed as estimating 
certain quantiles of the dose-response curve, is to derive a measure of carcinogenic 
potency. The measure proposed by Clayson et al. (1983) is defined by 

$$C_q = K - \log_{10} d_q, \tag{6.30}$$

where the dose d is expressed in pmol/kg body weight/day. The logarithm of dose is 
used to put the index on an order-of-magnitude basis, with the minus sign associating 
large values of $C_q$ with low values of $d_q$. The constant K is set equal to 7 in order that 
$C_q$ will usually lie in the range 1-10. By choosing a moderate value of q, say 
$0.10 \le q \le 0.50$, the model dependency encountered in estimating lower quantiles is 
avoided. 

Despite its simplicity, the index $C_q$ seems to provide a useful method of ranking 
animal carcinogens. Values of $C_q$ with q = 0.25 for a selection of suspected and 
well-known animal carcinogens are shown in Table 6.2. Saccharin, the carcinogenicity 
of which has been widely debated, is assigned a potency index of 1.8, whereas the 
highly potent 2,3,7,8-tetrachlorodibenzo-para-dioxin (TCDD) has a value of 9.1. For a 
more complex ranking system that takes into account other factors, such as the 
spectrum of neoplasia induced and genotoxicity, the reader is referred to Squire (1981) 
and the related discussion by Crump (1983) and Theiss (1983). 
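
The index (6.30) and its inverse are simple to compute. In the sketch below the function names are hypothetical, and the two index values are those quoted above for saccharin and TCDD. Since the index is on a base-10 logarithmic scale, a difference of 7.3 units corresponds to a ratio of 10^7.3 between the underlying doses.

```python
import math

K = 7.0  # constant used in the definition of the index

def potency_index(d_q):
    """C_q = K - log10(d_q), with d_q in the dose units used in the text."""
    return K - math.log10(d_q)

def dose_from_index(c_q):
    """Inverse of the index: the dose d_q implied by a value of C_q."""
    return 10.0 ** (K - c_q)

saccharin, tcdd = 1.8, 9.1  # index values quoted in the text
print(dose_from_index(saccharin), dose_from_index(tcdd))
print(dose_from_index(saccharin) / dose_from_index(tcdd))
```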

Other quantitative measures of carcinogenic potency have also been proposed. 
Historically, Twort and Twort (1930, 1933) and Iball (1939) proposed several measures 
of potency in an attempt to summarize the data obtained from their experimental 
studies. For example, one of their measures was based on the time at which 25% of the 
animals would have developed tumours. Irwin and Goodman (1946) subsequently 
considered other related measures of potency, and noted that the different indices 
tended to give somewhat similar results. Bryan and Shimkin (1943) suggested the use 
of the dose required to induce tumours in 50% of the exposed animals, the quantity on 
which Meselson and Russell's (1977) index is based. 

Table 6.2 Potencies C_0.25 of some selected compounds 

Compound (Reference)                                Species  Site     Potency 
Saccharin (Scientific Review Panel, 1983)           Rat      Bladder  1.8 
2-Acetylaminofluorene (Littlefield et al., 1980a)   Mouse    Bladder  4.3 
                                                             Liver    4.4 
DDT (Thorpe & Walker, 1973)                         Mouse    Liver    5.0 
Aflatoxin (Wogan et al., 1974)                      Rat      Liver    8.6 
Dioxin (National Toxicology Program, 1982)          Rat      Thyroid  9.1 

More recently, Jones et al. (1981) considered the use of the time until 50% of the 
exposed animals would die from the tumour of interest as a measure of potency. 
Crouch and Wilson (1981) took the slope parameter $\lambda$ in the one-hit model in (6.7) as a 
measure of potency. Noting that the probability of tumour induction $P(d) \approx \lambda d$ for low 
levels of exposure d, Crouch and Wilson also proposed using the value of $\lambda$ as a means 
of estimating the response rate P at a given dose d. This will be reasonable when the 
one-hit model, which is essentially linear even at moderate doses, provides an adequate 
description of the observed dose-response curve, but will be less satisfactory when the 
dose-response curve is highly nonlinear. 
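
The quality of the linear approximation to the one-hit model depends only on the expected number of hits, that is, on the product of the slope parameter and the dose. The sketch below, with a hypothetical slope value, compares the two and reports the relative error of the linear form.

```python
import math

def one_hit(dose, lam):
    """One-hit model, equation (6.7)."""
    return 1.0 - math.exp(-lam * dose)

def linear_approximation(dose, lam):
    """Low-dose approximation: response rate proportional to dose."""
    return lam * dose

def relative_error(dose, lam):
    """Relative overstatement of the response by the linear form."""
    exact = one_hit(dose, lam)
    return (linear_approximation(dose, lam) - exact) / exact

lam = 0.1  # hypothetical slope value
for d in (0.1, 1.0, 5.0, 10.0):
    print(d, one_hit(d, lam), linear_approximation(d, lam), relative_error(d, lam))
```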

Sawyer et al. (1984) proposed a potency index based on the TD50, or dose estimated 
to induce tumours in 50% of the exposed subjects. The method provides for the effects 
of both dose and time, and, like the method of Crouch and Wilson, is based on a 
simple exponential or one-hit model for the effects of dose. In cases where the time to 
tumour occurrence may not be directly observable, moreover, Sawyer et al. used the 
time to death with tumour present as a surrogate for the observable failure time (see 
Peto et al., 1984, for further discussion of this point). 

Fig. 6.8 Range of carcinogenic potency in male rats (from Gold et al., 1984) 

The index proposed by Sawyer et al. (1984) has recently been calculated using an 
extensive data base of known animal carcinogens compiled by Gold et al. (1984). 
Expressing the TD50 in terms of the daily intake of the compound relative to total body 
weight, this analysis revealed potencies varying over more than seven orders of 
magnitude (Fig. 6.8). Only a few nanograms of TCDD, for example, were required to 
induce a 50% tumour occurrence rate during the course of a rodent lifetime, whereas 
several grams of the food colours FD & C Red No. 1 and FD & C Green No. 1 were 
required to elicit the same rates of response. 


6.3 Time-to-tumour models 

Why use time-to-tumour models? 

There are a number of situations in which the use of models based on time-to- 
tumour information has advantages. For example, when survival differs very greatly in 
different groups, one may be able to make fuller use of the data. In the extreme case, 
where near the end of an experiment there are survivors in only one group, most of the 
methods described so far cannot make use of any tumours detected in that group, as 
there is nothing with which to compare them. Where a parametric time-to-tumour 
model can be fitted, however, these data can be used to make more precise group 
comparisons. 

Time-to-tumour models may allow one to present the results fully in a concise 
manner, which allows direct comparison with results from other comparable experiments. 
As we shall see, it is often possible to fit parametric models in which certain 
parameters, common to all groups, describe the general shape of the tumour incidence 
or survival curve, while a further single parameter, estimated separately for each 
group, describes the strength of the treatment effects. If common shape parameters are 
used, the strength parameters can be used to compare directly the results of different 
experiments, which would not be possible for methods based on testing a null 
hypothesis. 

Graphical presentation of the observed and fitted time curves for tumour onset in 
treatment and control groups can be used to indicate whether there might be any 
interaction between the effect of treatment and time that should be investigated in 
more detail. The null hypothesis methods may, for example, not pick up a situation in 
which treatment increases tumour incidence early on in the study but decreases it later. 

Time-to-tumour models are also of particular use in experiments specifically aimed at 
characterizing the mode of action of the carcinogen being tested. The shape of the 
time-response curve may assist in indicating whether the animal model used is apposite 
to the human situation where information on time-response may also be available. 
Furthermore, especially in more complex designs such as stopping experiments, it may 
give insight into whether the carcinogen had initiating or promoting action. 

Finally, as noted in the previous section, time-to-tumour models may allow a more
reliable method of low-dose extrapolation than those based on percentages of animals
with tumours, especially where high doses markedly reduce mortality.

There are two main disadvantages of time-to-tumour models. One is the need to
make the additional assumption, as compared with nonparametric methods, that
tumour incidence follows a particular parametric relationship with time. A poor choice 
of relationship can affect conclusions as to the carcinogenicity of the treatment under 
test. The second is that, generally, far more extensive computing is required. 

A general formulation 

In the past, time-to-tumour models have mainly been applied to visible tumours.
However, in recent years a number of attempts have been made to consider the more
general situation where tumours may also be fatal or incidental (for example, Hartley
& Sielken, 1977; Kodell et al., 1982a; Peto et al., 1984).

A simple way to model a long-term animal experiment is illustrated in Figure 6.9,
where the possible states an animal can be in are given as boxes, and the connecting
arrows indicate possible transitions (Kodell & Nelson, 1980). In this model, an animal
starts in a normal disease-free state ($N$) and may at some time either develop a tumour
($T$), or perhaps more precisely enter a state where a tumour can be detected, or die
from a cause unrelated to the tumour of interest ($D_{NT}$). An animal with a tumour may
also subsequently die from this unrelated cause, or may die because of the tumour
($D_T$).

Fig. 6.9 Illustration of illness and death states with possible transitions in rodent bioassay;
$N$, normal; $T$, tumour; $D_T$, death from tumour; $D_{NT}$, death not from tumour (from
Kodell & Nelson, 1980)

This model is aimed at the types of observation which can arise from a long-term 
experiment. The assumption that a transition is made from a normal state to a state 
where the tumour is detectable is simplistic, inasmuch as it does not take into account 
details of the underlying biological processes. However, it would not in practice be 
possible to identify all of these states, even by increasing the number of investigations. 
Even in the model proposed, the information on cause of death, required to distinguish
between $D_T$ and $D_{NT}$ when a tumour is present, is often difficult or impossible to
obtain.

Three random variables may be used to describe the above model: 

(a) $X$: time to onset of tumour, or transition time from the normal state ($N$) to the
tumour-bearing state ($T$)

(b) $Y$: time to death due to tumour, or transition time from the normal state ($N$) to
the death-from-tumour state ($D_T$)

(c) $Z$: time to death from an unrelated cause, or transition time from the normal
state ($N$) to the death-not-from-tumour state ($D_{NT}$).

The random variable $Z$ is not of major concern in drawing statistical conclusions on
the process of carcinogenesis and is not considered further, except with regard to
assumptions about $Z$ needed to form the likelihood functions upon which statistical
inferences may be based. The two random variables $X$ and $Y$, however, which have to
satisfy the condition $X \le Y$, are of major interest for the statistical inference.

Let $G_X(t)$ and $G_Y(t)$ be the distribution functions of $X$ and $Y$, and
$S_X(t) = 1 - G_X(t) = \mathrm{pr}(X \ge t)$ and $S_Y(t) = 1 - G_Y(t) = \mathrm{pr}(Y \ge t)$
the survivor functions of $X$ and $Y$.

In addition, we consider for any index $X$ or $Y$ the density $f(t) = dG(t)/dt$, the hazard
function $\lambda(t) = f(t)/S(t)$ and the cumulative hazard function
$\Lambda(t) = \int_0^t \lambda(u)\,du$. Note that $S(t) = \exp[-\Lambda(t)]$.

The hazard function $\lambda(t)$ is an extremely useful tool for modelling distributions of
random variables which represent the time to a well defined event. It can also be
defined as the conditional probability per unit time that the event of interest occurs at
time $t$, if it has not occurred before that time. Let $X$ be the random variable of interest:

$$\lambda_X(t) = \lim_{\Delta t \to 0} \frac{\mathrm{pr}(t \le X < t + \Delta t \mid X \ge t)}{\Delta t} = \frac{f_X(t)}{S_X(t)}.$$

The hazard function is therefore also denoted the age-specific failure rate.

In a long-term animal experiment, considering visible or occult tumours, four 
different events can be observed at a certain time point: 

A. appearance of a visible tumour 

B. death caused by the tumour of interest (fatal context) 

C. death from unrelated cause, tumour of interest present (incidental context) 

D. death without tumour of interest. 

It has to be noted that deaths under C and D, from a cause unrelated to the tumour 
of interest, can occur in scheduled sacrifices. Thus, animals killed because of the 
experimental design or other external reasons represent observations of type C and D, 
depending on whether the tumour of interest is seen or not seen in the particular 
animal. 

The contributions to the likelihood function of the four types of observations,
expressed in terms of the two random variables $X$ and $Y$, are as follows:

A. density of $X$: $f_X(t)$

B. density of $Y$: $f_Y(t)$

C. $\mathrm{pr}(X < t \le Y)$: $S_Y(t) - S_X(t)$

D. survivor function of $X$: $S_X(t)$.

In a given experiment, let $t_1, t_2, \ldots, t_K$ be the distinct times at which events of the
above type are observed; $a_k$, $b_k$, $c_k$ and $d_k$ ($k = 1, \ldots, K$) are the numbers of events of
type A, B, C or D at time $t_k$. The latent-failure-time approach is taken in the formation
of likelihoods below. Although this is the most common approach to the competing
risks problem, strong yet unverifiable assumptions are required to form the likelihoods
for occult tumour data (Kalbfleisch et al., 1983).

We consider four typical types of experiments: 

(i) Visible tumours

As the time of onset is observable directly, there is no interest in the random
variable $Y$; only events of the type A and D are observed, leading to a likelihood
function

$$L_1 = \prod_{k=1}^{K} \{f_X(t_k)\}^{a_k} \{S_X(t_k)\}^{d_k}.$$

With $f_X(t) = \lambda_X(t) S_X(t)$ and $S_X(t) = \exp(-\Lambda_X(t))$, the log-likelihood function is often
expressed as

$$LL_1 = \sum_{k=1}^{K} \{a_k \log[\lambda_X(t_k)] - (a_k + d_k)\Lambda_X(t_k)\},$$

using only the hazard function and its cumulative version, which are often the basic
entities on which parametric models are formulated.

(ii) Occult tumours - all tumours observed in a fatal context

In this special situation, where no tumours are found in animals dying from
unrelated causes, only events of types B and D are observed. The likelihood function is

$$L_2 = \prod_{k=1}^{K} \{f_Y(t_k)\}^{b_k} \{S_Y(t_k)\}^{d_k}$$

or, as above, the log-likelihood function

$$LL_2 = \sum_{k=1}^{K} \{b_k \log[\lambda_Y(t_k)] - (b_k + d_k)\Lambda_Y(t_k)\},$$

which are formally identical to $L_1$ and $LL_1$, respectively.
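
Because the log-likelihoods for cases (i) and (ii) share one form, a single routine can evaluate either of them once a hazard and a cumulative hazard have been chosen. The following is a minimal sketch rather than a fitting program: the function name, the argument names and the constant-hazard example data are all hypothetical.

```python
import math

def log_likelihood(times, events, censored, hazard, cum_hazard):
    """Sum over k of e_k*log(hazard(t_k)) - (e_k + d_k)*cum_hazard(t_k)."""
    total = 0.0
    for t, e, d in zip(times, events, censored):
        if e > 0:  # the log-hazard term only arises where events were observed
            total += e * math.log(hazard(t))
        total -= (e + d) * cum_hazard(t)
    return total

# Hypothetical data: event counts a_k (or b_k) and tumour-free deaths d_k
# at three observation times, under an assumed constant hazard.
RATE = 0.01
ll_example = log_likelihood(
    times=[20.0, 35.0, 50.0],
    events=[1, 0, 2],
    censored=[0, 3, 1],
    hazard=lambda t: RATE,
    cum_hazard=lambda t: RATE * t,
)
```

Passing the hazard and cumulative hazard as functions keeps the routine independent of any particular parametric family.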

(iii) Occult tumours - all tumours observed in an incidental context

In this case $S_Y(t) = 1$, as no deaths due to the tumour of interest are occurring, only
events of the types C and D being observed. In order to form a likelihood for this case
it must be assumed that tumour-bearing and tumour-free animals of the same age have
identical hazard functions for death unrelated to tumour (that is, intercurrent
mortality). The likelihood function under this assumption is

$$L_3 = \prod_{k=1}^{K} \{1 - S_X(t_k)\}^{c_k} \{S_X(t_k)\}^{d_k}.$$

The log-likelihood function expressed in terms of $\Lambda_X(t)$ would be

$$LL_3 = \sum_{k=1}^{K} \{c_k \log[1 - \exp(-\Lambda_X(t_k))] - d_k \Lambda_X(t_k)\}.$$

(iv) Occult tumours - observed in fatal or incidental context

In this most general case, events of the types B, C and D are observed. In order to
form a likelihood for this case, it must, of course, be assumed that each tumour can be
reliably classified as either fatal or incidental. In addition, it must be assumed that
tumour-bearing and tumour-free animals of the same age have identical hazard
functions for death unrelated to tumour. The likelihood function under these
assumptions is

$$L_4 = \prod_{k=1}^{K} \{f_Y(t_k)\}^{b_k} \{S_Y(t_k) - S_X(t_k)\}^{c_k} \{S_X(t_k)\}^{d_k}.$$

Kodell et al. (1982a) pointed out that

$$S_Y(t) - S_X(t) = [1 - Q(t)]\,S_Y(t),$$

with $Q(t) = S_X(t)/S_Y(t) = \mathrm{pr}(X > t \mid Y > t)$ being the conditional probability of tumour
onset after time $t$, given survival without death from tumour through time $t$. Even under
the strong assumptions noted above, the function $S_Y(t)$ does not have a simple interpretation in
terms of time to death due to tumour alone (that is, independent of the influence of
time to onset of tumour).

From this it follows that

$$L_4 = \prod_{k=1}^{K} \{f_Y(t_k)\}^{b_k} \{S_Y(t_k)\}^{c_k + d_k} \{1 - Q(t_k)\}^{c_k} \{Q(t_k)\}^{d_k},$$

which can be written as the product of two parts. One,

$$L_4^{(1)} = \prod_{k=1}^{K} \{f_Y(t_k)\}^{b_k} \{S_Y(t_k)\}^{c_k + d_k},$$

depends on $S_Y(t)$ only; the other,

$$L_4^{(2)} = \prod_{k=1}^{K} \{1 - Q(t_k)\}^{c_k} \{Q(t_k)\}^{d_k},$$

on $Q(t)$ only.

The two components of the log-likelihood function are

$$LL_4^{(1)} = \sum_{k=1}^{K} \{b_k \log[\lambda_Y(t_k)] - (b_k + c_k + d_k)\Lambda_Y(t_k)\},$$

which is similar to $LL_2$ above; and, taking $\Lambda_Q(t) = -\log Q(t)$, the second part

$$LL_4^{(2)} = \sum_{k=1}^{K} \{c_k \log[1 - \exp(-\Lambda_Q(t_k))] - d_k \Lambda_Q(t_k)\},$$

which is similar to $LL_3$.
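
The factorization of the general likelihood into a part depending on the survivor function of Y and a part depending on Q can be checked numerically. The sketch below is only an illustration: the function names, the event counts and the exponential survivor functions are hypothetical, chosen so that the survivor function of X never exceeds that of Y.

```python
import math

def l4_direct(obs, f_y, s_y, s_x):
    """Product over times of f_Y^b * (S_Y - S_X)^c * S_X^d."""
    return math.prod(
        f_y(t) ** b * (s_y(t) - s_x(t)) ** c * s_x(t) ** d for t, b, c, d in obs
    )

def l4_factored(obs, f_y, s_y, s_x):
    """The same likelihood written as two factors, using Q = S_X / S_Y."""
    q = lambda t: s_x(t) / s_y(t)
    part1 = math.prod(f_y(t) ** b * s_y(t) ** (c + d) for t, b, c, d in obs)
    part2 = math.prod((1.0 - q(t)) ** c * q(t) ** d for t, b, c, d in obs)
    return part1 * part2

# Hypothetical observations (t_k, b_k, c_k, d_k) and assumed exponential forms.
obs = [(10.0, 1, 0, 2), (20.0, 0, 2, 1), (30.0, 2, 1, 0)]
s_y = lambda t: math.exp(-0.01 * t)
s_x = lambda t: math.exp(-0.03 * t)
f_y = lambda t: 0.01 * math.exp(-0.01 * t)

direct = l4_direct(obs, f_y, s_y, s_x)
factored = l4_factored(obs, f_y, s_y, s_x)
```

The two values agree up to floating-point rounding, which is all the factorization asserts.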

In the case where all tumours are fatal or visible, nonparametric estimation of the
survivor functions can be made by the Kaplan-Meier method, as discussed in Section
5.3. Where all tumours are incidental, nonparametric estimation can be done by the
method of Hoel and Walburg, as discussed in Section 5.5. Kodell et al. (1982a), who
consider the fourth, most general case where tumours may be either fatal or incidental,
note that the Kaplan-Meier estimator can be applied to the first part, $L_4^{(1)}$ or $LL_4^{(1)}$, to
estimate $S_Y(t)$. With the assumption that the ratio $Q(t)$ is monotonically decreasing,
they then derive a nonparametric maximum likelihood estimate for $S_X(t)$ as well.

Dinse and Lagakos (1982) weaken the restriction of monotonicity of $Q(t)$ and
propose to estimate $S_X(t)$ in the class of all survivor functions that are stochastically
smaller than $S_Y(t)$. The Kaplan-Meier estimate of $S_Y(t)$ is used as a starting point in an
iterative approach. Given this estimate, $LL_4^{(2)}$ is maximized for $Q(t)$ under the
restriction that the resulting estimate must be a monotone decreasing survival function.
This is enforced by applying techniques of isotonic regression. The estimate for $S_X(t)$
thus derived is then inserted into the original likelihood function, which is maximized
again for $S_Y(t)$. This process is iterated until convergence.

Turnbull and Mitchell (1984) have proposed a simpler algorithm, by addressing the
problem in terms of the joint distribution of time of onset of tumour ($X$) and time of
death from tumour ($Y$), which have marginal distributions $1 - S_X(t)$ and $1 - S_Y(t)$. They
point out that the joint distribution can have positive mass only on a finite set of
disjoint intervals in the $(x, y)$-plane, which can be constructed from the observations.
They then use the EM-algorithm to estimate the probability masses to be attributed to
these intervals. The marginal distributions, and hence the survivor functions $S_X(t)$ and
$S_Y(t)$, can then be derived from the estimate of the joint distribution.

Choice of parametric model 

The ideal model to use would clearly be one derived from sound biological theories 
of the carcinogenic process. In practice, of course, understanding of the carcinogenic 
process is far from perfect, but one can still aim for a model which both fits in with 
available knowledge and fits observed data at least reasonably well. By far the most 
attention has been given to the multistage model and, in the case of carcinogenesis 
experiments in which the dose has been applied continuously or at regular intervals, to 
the use of the Weibull distribution derived from it. For this reason, and also because it 
has proved satisfactory for analysis of a considerable number of data sets, we will 
consider this in detail first and turn our attention to other models only at the end of this 
section. 

Multistage models and derivation of the Weibull distribution 

Armitage and Doll (1954) observed that, in humans, the age-specific incidence rates 
of many types of cancer are proportional to a power of age (or time from first 
exposure) and showed that this result would be expected under a multistage model. 
This model makes the following simplifying assumptions: 

(i) that there is a large and constant number ($N$) of cells at risk,

(ii) that all the cells start in an identical state,

(iii) that at least one of them has to undergo a fixed number $\kappa$ of stages or
transformations before a tumour appears, and

(iv) that any cell in a given stage has the same very small, but constant probability
per unit time of commencing the transformation into the next stage (kinetic
rate-constants $\delta_1, \delta_2, \ldots, \delta_\kappa$).

Under these assumptions, the number of cells that have undergone the first $\kappa - 1$
stages by time $t$ is given by

$$N \int_0^t \int_0^{u_{\kappa-1}} \cdots \int_0^{u_2} \delta_1 \delta_2 \cdots \delta_{\kappa-1} \, du_1 \, du_2 \cdots du_{\kappa-1} = \frac{N \delta_1 \delta_2 \cdots \delta_{\kappa-1} \, t^{\kappa-1}}{(\kappa-1)!}.$$

If it is further assumed that each transformation takes a constant time
($w_1, w_2, \ldots, w_\kappa$), the formula for the incidence rate $I(t)$ becomes

$$I(t) = \beta \kappa (t - w)^{\kappa - 1},$$

where $\beta = N \delta_1 \delta_2 \cdots \delta_\kappa/(\kappa!)$ and $w = \sum_{s=1}^{\kappa} w_s$ is the sum of the constant times of the
transformations, which can be seen as the minimal time to tumour. This is a particular
way of parametrizing the Weibull distribution. The survivor function is then

$$S(t) = \exp\{-\beta (t - w)^{\kappa}\}. \qquad (6.31)$$
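
As a small numerical companion to the survivor function (6.31), the sketch below evaluates it together with the hazard it implies (minus the derivative of its logarithm) and compares that hazard with a finite-difference approximation. The function names and the parameter values are hypothetical, and treating the hazard as zero before the minimal time w is an assumption of the sketch.

```python
import math

def weibull_survivor(t, beta, kappa, w):
    """S(t) = exp(-beta * (t - w)**kappa) for t > w, and 1 up to the minimal time w."""
    return math.exp(-beta * (t - w) ** kappa) if t > w else 1.0

def weibull_hazard(t, beta, kappa, w):
    """Hazard -d/dt log S(t) = beta * kappa * (t - w)**(kappa - 1) for t > w."""
    return beta * kappa * (t - w) ** (kappa - 1) if t > w else 0.0

# Hypothetical parameter values, used for illustration only.
beta, kappa, w = 1e-5, 3.0, 10.0
t, eps = 50.0, 1e-4
numeric_hazard = -(
    math.log(weibull_survivor(t + eps, beta, kappa, w))
    - math.log(weibull_survivor(t - eps, beta, kappa, w))
) / (2 * eps)
analytic_hazard = weibull_hazard(t, beta, kappa, w)
```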

While this model is clearly an over-simplification, it may be expected that if cancer
mechanisms in animals and humans are of this type, incidence rate data from
laboratory animals, which are often inbred and presumably therefore more
homogeneous, may follow a Weibull distribution even more clearly than for humans. The
suggestion of using Weibull distributions to analyse continuous carcinogenesis
experiments in animals where time-to-tumour is observable was first made by Pike (1966).

From a study of the way in which the formula was derived, it should be clear that $\kappa$
and $w$ are inherent properties of the process being studied and should not vary between
groups within an experiment studying the same cancer type in the same species of
animals. The parameter $\beta$, on the other hand, should be affected by treatment,
assuming that the effect of continuous application of a carcinogen will be to alter at
least one of the kinetic rate constants. In an experiment with several dose groups,
$i = 1, \ldots, I$, say, one would consider different parameters $\beta_1, \ldots, \beta_I$ for the Weibull
distribution of time to tumour for the animals from the respective groups, or, as will be
outlined later, one would model the dependence of the $\beta_i$ on the dose level or other
covariates.


Fit of Weibull distributions to visible tumour data 

Weibull distributions have been used mainly in the literature to analyse visible
tumour data, often from mouse skin-painting experiments widely used in the 1960s and
1970s to evaluate the carcinogenicity of tobacco-smoke condensates and of other
chemicals. In the next few sections we consider these applications in some detail,
before turning to more recent work using Weibull distributions in the analysis of
tumours not visible in life. Some of the ideas used in the visible tumour analyses (for
example, for significance testing of treatment effects) have natural analogues for
nonvisible tumours, but are described in detail only for the former situation.

Maximum likelihood estimation of the parameters $\beta$, $\kappa$ and $w$ of the Weibull
distribution is discussed in detail by Peto and Lee (1973). The contribution to the
log-likelihood $LL_1$ of an animal dying without a tumour at time $t'$ is given by

$$\log[S_X(t')] = -\beta (t' - w)^{\kappa},$$

while that of an animal getting a tumour at time $t''$ is given by

$$\log[\lambda_X(t'') S_X(t'')] = \log \beta + \log \kappa + (\kappa - 1)\log(t'' - w) - \beta (t'' - w)^{\kappa}.$$

To illustrate the fitting of Weibull models, consider the cigar-smoke condensate data
(Section 4.2). The maximum likelihood estimates (and standard errors obtained from
the inverse of the information matrix) of $w$, $\kappa$ and $\beta$ for the three cigar-smoke
condensate groups are given in Table 6.3.


Table 6.3 Parameter estimates of Weibull models fitted to the three groups,
from data on cigar-smoke condensate

                          Low dose          Middle dose       High dose

w ± SE                    19.5 ± 3.6        30.4 ± 10.1       34.5 ± 4.9
κ ± SE                    1.7 ± 0.6         2.3 ± 0.8         2.0 ± 0.5
(β ± SE) × 10^4           2.13 ± 5.95       0.62 ± 2.25       3.17 ± 6.61
Log-likelihood            -131.522          -164.084          -180.537


Because the estimates of $\kappa$ and $w$ have not been constrained to be equal in the three
groups, the estimates in the table do not provide a particularly useful summary of the
relationship between cigar-smoke condensate dose and tumour incidence. The standard
error estimates are relatively large because the likelihood functions are
quite flat around their maxima. Accordingly, when making inferences about the
Weibull parameters, likelihood ratio tests should be employed rather than tests based
on the asymptotic normality of the maximum likelihood estimators.

Estimated percentiles often provide a useful summary of the Weibull time-to-tumour
curves. Estimates of percentiles can be obtained directly from the estimated tumour
incidence curves, and standard errors of estimated percentiles can be derived using the
delta method (Miller, 1981b, pp. 25-27). Estimates of the 25th percentile (denoted $T_{25}$)
and the 50th percentile (that is, the median, denoted $T_{50}$) for the three cigar-smoke
condensate groups are as follows:

                          Low dose          Middle dose       High dose

T_25 ± SE                 84.5 ± 8.3        71.4 ± 4.2        62.5 ± 3.1
T_50 ± SE                 127.6 ± 19.2      90.8 ± 4.8        77.5 ± 3.5

The percentiles provide a much more useful summary of the Weibull curves than was
provided by the estimates of the three Weibull parameters. They indicate that the
tumours are appearing earlier with increasing dose of cigar-smoke condensate. The
standard errors of the percentile estimates, with the exception of the median for the
low-dose group, are less than 10% of their corresponding estimates. The standard error
for the estimated median of the low-dose group is large because the estimate, in this
case, represents an extrapolation outside the observed time period.
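
A percentile of a fitted Weibull curve follows from setting the survivor function (6.31) equal to one minus the chosen proportion and solving for time. The sketch below uses a hypothetical function name and, as input, the rounded high-dose estimates from Table 6.3; because those inputs are rounded, the results should only approximate the percentiles tabulated above.

```python
import math

def weibull_percentile(p, beta, kappa, w):
    """Time t at which S(t) = exp(-beta * (t - w)**kappa) has fallen to 1 - p."""
    return w + (-math.log(1.0 - p) / beta) ** (1.0 / kappa)

# Rounded high-dose estimates from Table 6.3 (w, kappa, beta).
w_high, kappa_high, beta_high = 34.5, 2.0, 3.17e-4
t25_high = weibull_percentile(0.25, beta_high, kappa_high, w_high)
t50_high = weibull_percentile(0.50, beta_high, kappa_high, w_high)
```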

To formulate the full log-likelihood function $LL_1$, in the situation of common values
for $w$ and $\kappa$ but different parameters $\beta_i$ ($i = 1, \ldots, I$) for the different experimental
groups, we have to extend our notation slightly. Let $a_{ki}$ and $d_{ki}$ be the number of
events of type A and D observed at time $t_k$ in group $i$ ($k = 1, \ldots, K$; $i = 1, \ldots, I$).
Then we have

$$LL_1 = \sum_{k=1}^{K} \sum_{i=1}^{I} \bigl[a_{ki}\{\log \beta_i + \log \kappa + (\kappa - 1)\log(t_k - w)\} - (a_{ki} + d_{ki})\,\beta_i (t_k - w)^{\kappa}\bigr].$$

Maximization of this log-likelihood function, which depends on $\beta_1, \ldots, \beta_I$, $\kappa$ and $w$, is
not straightforward, but can be achieved satisfactorily by a modified Newton-Raphson
iterative procedure.

For the cigar-smoke condensate data, the maximum likelihood estimates of $w$ (± SE)
and $\kappa$ (± SE) are 17.5 (± 6.45) and 2.8 (± 0.5), respectively, and the maximum likelihood
estimates of $\beta_i$ are as follows:

                          Low dose (i = 1)  Middle dose (i = 2)  High dose (i = 3)

(β_i ± SE) × 10^6         2.28 ± 5.22       4.50 ± 10.22         7.69 ± 17.28

The parameter estimates now are indicative of an increasing tumour rate with
increasing dose of cigar-smoke condensate. The log-likelihood for this model, with
common $w$ and $\kappa$ but different $\beta_i$, is -481.173. Fitting a model with a common $\beta$ in
addition to common $w$ and $\kappa$ gives a log-likelihood of -491.353. Thus, the likelihood
ratio test statistic for $H_0$: $\beta_1 = \beta_2 = \beta_3$ is computed as 2(491.353 - 481.173) = 20.36.
Comparing this computed value to a table of percentiles for the chi-square distribution
with two degrees of freedom (in general, $I - 1$ degrees of freedom) we find that
$p = 0.00004$, indicating a strong effect of cigar-smoke condensate on tumour incidence.

The sum of the log-likelihoods from the initial fits of separate Weibull models to
each dose group is -131.522 - 164.084 - 180.537 = -476.143. Thus, a test of the null
hypothesis that $w_1 = w_2 = w_3$ and $\kappa_1 = \kappa_2 = \kappa_3$ can be based on the likelihood ratio test
statistic, which is computed as 2(481.173 - 476.143) = 10.06. Comparing this computed
value to a table of percentiles for the chi-square distribution with four degrees of
freedom (in general, $2I - 2$ degrees of freedom), we find that $p = 0.039$, indicating
some evidence of heterogeneity. It is interesting to note that in spite of this evidence
that the assumption of common $w$ and $\kappa$ may be invalid, the likelihood ratio test
statistic for $H_0$: $\beta_1 = \beta_2 = \beta_3$ differed only slightly from the log-rank test statistic
calculated for the cigar-smoke condensate data in Section 5.6 (that is, 20.16).
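
The two likelihood ratio tests quoted for the cigar-smoke condensate data can be reproduced from the reported log-likelihoods alone. The sketch below avoids any statistics library by using the closed form of the chi-square upper tail that holds for even degrees of freedom; the function and variable names are hypothetical.

```python
import math

def chi2_upper_tail_even_df(x, df):
    """Chi-square upper tail for even df: exp(-x/2) * sum_{j < df/2} (x/2)**j / j!."""
    if df <= 0 or df % 2:
        raise ValueError("this closed form needs a positive even number of degrees of freedom")
    half = x / 2.0
    return math.exp(-half) * sum(half ** j / math.factorial(j) for j in range(df // 2))

# Log-likelihoods reported in the text.
ll_separate = -131.522 - 164.084 - 180.537   # separate w, kappa, beta per group
ll_common_shape = -481.173                   # common w and kappa, separate beta
ll_common_all = -491.353                     # common w, kappa and beta

lr_treatment = 2.0 * (ll_common_shape - ll_common_all)       # 2 degrees of freedom
lr_heterogeneity = 2.0 * (ll_separate - ll_common_shape)     # 4 degrees of freedom
p_treatment = chi2_upper_tail_even_df(lr_treatment, 2)
p_heterogeneity = chi2_upper_tail_even_df(lr_heterogeneity, 4)
```

For odd degrees of freedom a general chi-square routine would be needed instead of this closed form.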

Peto and Lee (1973) discuss various possibilities for proceeding in the case of known
or unknown parameters $\kappa$ and $w$. Of particular importance is the situation in which the
main interest is in between-treatment comparison, that is, in the relative magnitude of
the $\beta_i$ values. Then, rather than carry out full estimation of $\kappa$, $w$ and $\beta_i$, it is
reasonably satisfactory, and much simpler computationally, to fix $\kappa$ and $w$ values from
previous experience and compute the $\beta_i$ from the formula

$$\beta_i = s_i / v_i,$$

where $s_i = \sum_{k=1}^{K} a_{ki}$ is the number of animals bearing tumours in the $i$th group and
$v_i = \sum (t_k - w)^{\kappa}$, the summation being over both times of tumour and times of death
without tumour in group $i$. Pike (1966) has demonstrated that the ratio of $\beta$'s between
different groups is virtually independent of the actual values of $\kappa$ and $w$ chosen,
provided the ($\kappa$, $w$) pair is not too far from the best fitted values. Where $\kappa$ and $w$ are
known, the asymptotic variance of $\beta_i$ is given by $\mathrm{var}\,\beta_i = \beta_i^2/s_i$.
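
With the shape and minimal-time parameters held fixed, the estimate for one group reduces to a ratio that is easy to compute directly. The sketch below is illustrative only: the function name and the data are hypothetical, at least one tumour is assumed, and counting an animal that leaves the study before the minimal time w as contributing zero exposure is an assumption of the sketch rather than something stated in the text.

```python
def beta_estimate(tumour_times, tumour_free_death_times, kappa, w):
    """Return (beta, var_beta) with beta = s / v and var_beta = beta**2 / s."""
    s = len(tumour_times)  # number of tumour-bearing animals; assumed to be > 0
    all_times = list(tumour_times) + list(tumour_free_death_times)
    v = sum(max(t - w, 0.0) ** kappa for t in all_times)
    beta = s / v
    return beta, beta ** 2 / s

# Hypothetical group: two tumours, two deaths without tumour.
beta_hat, var_beta_hat = beta_estimate([10.0, 20.0], [30.0, 40.0], kappa=1.0, w=0.0)
```

With a shape of one and a minimal time of zero, as in this toy call, the estimate is simply tumours per unit of observation time.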

Goodness-of-fit to the Weibull distribution can be tested by dividing the experimental
time period into $J$ intervals $(T_0 = 0, T_1], (T_1, T_2], \ldots, (T_{J-1}, T_J]$. Within each
interval one compares the observed number of animals in the $i$th group first developing
a tumour in the $j$th interval, $O_{ij}$, with the number expected, $E_{ij}$. $E_{ij}$ is calculated by
$E_{ij} = \beta_i v_{ij}$, where $j$ refers to the interval ($j = 1, \ldots, J$), and $v_{ij}$ is calculated by
summing, for each animal of group $i$ surviving and tumour-free at $T_{j-1}$, the term
$(t^* - w)^{\kappa} - (T_{j-1} - w)^{\kappa}$, where $t^* = \min(T_j, t_k)$, that is, the time to death or tumour for
animals that experience one of these events during the $j$th interval, or the upper limit
of the interval for the remainder. If the numbers of tumours are too small per group, it
will often be useful to combine these $O_{ij}$ and $E_{ij}$ values over groups for each time
interval:

$$O_j = \sum_{i=1}^{I} O_{ij} \quad \text{and} \quad E_j = \sum_{i=1}^{I} E_{ij}.$$

The statistic $X^2 = \sum_{j=1}^{J} (O_j - E_j)^2/E_j$ should then produce an approximate chi-square
variable on $J - 1$ or $J - 3$ degrees of freedom, depending on whether $\kappa$ and $w$ were
assumed to be known or were fitted from the data.
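
The bookkeeping behind the expected numbers can be sketched as follows for a single group. Everything named here is hypothetical, the data are far too sparse for a real test and serve only to show the arithmetic, and treating the power term as zero before the minimal time w is an assumption of the sketch.

```python
def interval_exposures(times, cuts, kappa, w):
    """v_j for each interval (T_{j-1}, T_j]: the sum, over animals still at risk at
    T_{j-1}, of (t* - w)**kappa - (T_{j-1} - w)**kappa with t* = min(T_j, t)."""
    def g(t):
        # (t - w)**kappa, taken as zero before the minimal time w (an assumption)
        return max(t - w, 0.0) ** kappa

    exposures = []
    for lower, upper in zip(cuts[:-1], cuts[1:]):
        at_risk = [t for t in times if t > lower]
        exposures.append(sum(g(min(upper, t)) - g(lower) for t in at_risk))
    return exposures

def chi_square_statistic(observed, expected):
    """Sum over intervals of (O_j - E_j)**2 / E_j."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical group: tumours at 12, 25 and 55 weeks, tumour-free death at 40 weeks.
times = [12.0, 25.0, 40.0, 55.0]
cuts = [0.0, 20.0, 40.0, 60.0]
kappa, w = 2.0, 10.0
v = interval_exposures(times, cuts, kappa, w)
beta = 3 / sum(v)                       # beta = s / v with s = 3 tumours
expected = [beta * v_j for v_j in v]
observed = [1, 1, 1]                    # tumours first seen in each interval
x2 = chi_square_statistic(observed, expected)
```

When the intervals cover every observation time, the interval exposures add up to the group total, so the expected numbers add up to the observed number of tumours.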


Treatment effects: estimation and significance testing

If $\kappa$ and $w$ are known or have been estimated from the data, then the log-likelihood
for an $I$ group experiment is given by

$$LL = \sum_{i=1}^{I} s_i \log \beta_i - \sum_{i=1}^{I} \beta_i v_i.$$

i = l i=l 

If the parameters $\beta_i$ for each group depend on certain covariates as explanatory
variables (dose, carcinogen, method of application, etc.) $z_1, z_2, \ldots, z_p$, where $z_{iu}$ is the
value of the $u$th variable in the $i$th group, then it is convenient to relate $\beta_i$ to the $z_{iu}$ by
the expression

$$\log \beta_i = \sum_{u=1}^{p} \theta_u z_{iu}$$

or

$$\beta_i = \exp\Bigl(\sum_{u=1}^{p} \theta_u z_{iu}\Bigr),$$

where the $\theta_u$ are regression coefficients to be estimated. It can be said that a log-linear
model for the $\beta_i$ is used. The log-likelihood function then becomes

$$LL = \sum_{i=1}^{I} s_i \sum_{u=1}^{p} \theta_u z_{iu} - \sum_{i=1}^{I} v_i \exp\Bigl(\sum_{u=1}^{p} \theta_u z_{iu}\Bigr).$$

Multiple regression methods based on maximum likelihood estimation are used for
this problem. Likelihood ratio tests can be employed to investigate the significance of
certain covariates. Let $LL^{(1)}$ be the log likelihood of a given model with a certain
number of covariates. Inclusion of $d$ further covariates alters the fitted log likelihood to
$LL^{(2)}$. Under the null hypothesis that the regression coefficients $\theta_u$ for the newly
included covariates are zero, $2(LL^{(2)} - LL^{(1)})$ should be approximately
chi-square-distributed with $d$ degrees of freedom. For detailed illustrations, see Peto and Lee
(1973).
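
The log-linear log-likelihood is straightforward to evaluate for trial values of the regression coefficients, which is the building block any maximization routine would call repeatedly. The sketch below is not the program of any cited author: the function names, the argument layout (one covariate row per group) and the example numbers are hypothetical.

```python
import math

def loglinear_ll(theta, s, v, z):
    """Sum over groups of s_i * eta_i - v_i * exp(eta_i), with eta_i = sum_u theta_u * z_iu."""
    total = 0.0
    for s_i, v_i, z_i in zip(s, v, z):
        eta = sum(th * z_iu for th, z_iu in zip(theta, z_i))  # log beta_i
        total += s_i * eta - v_i * math.exp(eta)
    return total

def lr_statistic(ll_smaller_model, ll_larger_model):
    """Twice the gain in fitted log-likelihood from the added covariates."""
    return 2.0 * (ll_larger_model - ll_smaller_model)

# Hypothetical two-group example: intercept plus log-dose as covariates.
s = [5, 12]
v = [2.0e5, 2.4e5]
z = [[1.0, math.log(1.0)], [1.0, math.log(2.0)]]
ll_trial = loglinear_ll([math.log(2.5e-5), 1.0], s, v, z)
```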

Aitkin and Clayton (1980) have published a computer program to fit such regression 
models to possibly censored failure-time data as they arise in this context. Their 
program is developed in the framework of the GLIM package (Baker & Nelder, 1978) 
for fitting generalized linear models. 



Support for the model 

The fourth assumption of the multistage hypothesis underlying the Weibull
distribution implies that, if treatment is continuous, the kinetic rate-constants for each stage do
not depend on the age of the animal. It follows that, provided the carcinogen affects
the first stage of the process strongly enough for it to be a reasonable approximation to
assume all observed tumours to have arisen because of this, the age-specific incidence
rates will depend wholly on the duration of treatment and not at all on age per se.

In a large experiment carried out by Peto et al. (1975), 3,4-benzo[a]pyrene (BP) was
applied to the skin of mice in four groups of increasing size starting at 10, 25, 40 and 55
weeks old, respectively. As can be seen from Figure 6.10, the percentage of mice
without a tumour, when plotted against age, differed markedly between the four
groups. However, when plotted against treatment duration, the four groups were
virtually identical, as expected from multistage assumptions. Having shown that the
relationships between tumour incidence and duration in the four groups were not
significantly different, Peto et al. (1975) combined the results of the four groups to
illustrate the overall fit to the Weibull distribution. This is shown in Figure 6.11; it is
obvious that, over the 100-fold range of incidences from 0.25% per fortnight up to the
massive rate of 25% per fortnight observed after 90 weeks of regular BP administration,
the points did approximately fit the theoretical straight line obtained by taking
logarithms of the Weibull equation $I = \beta(t - w)^{\kappa}$. The multistage model also predicts
that, if a carcinogen has an effect directly proportional to dose on each of $c$ (of $\kappa$)
kinetic-rate constants, and if the dose applied is sufficiently large for the background
rate-constants to be neglected for those $c$ stages of the cancer process, the age-specific
tumour incidence rate will then be proportional to dose to the power of $c$. Thus, if, say,
stages two and three of a five-stage process are affected linearly by treatment, $d$ is
dose, $\delta_i$ are background rate-constants and $\phi_i$ is the increment in rate-constant per unit
dose for affected stages, the Weibull parameter $\beta$ will be proportional to

$$\delta_1 (\delta_2 + \phi_2 d)(\delta_3 + \phi_3 d)\,\delta_4 \delta_5.$$

As $d$ becomes large, this approximates to

$$\delta_1 \phi_2 \phi_3 \delta_4 \delta_5 \, d^2.$$

Fig. 6.10 Percentage of tumourless mice against (a) age or (b) duration of exposure to
benzo[a]pyrene (from Peto et al., 1975). Symbols: ● = group 1; ▲ = group 2;
○ = group 3; ■ = group 4

Fig. 6.11 Incidence rates of 10-mm epithelial tumours at successive fortnightly chartings
against duration of benzo[a]pyrene (BP) application, on a log-log scale from 28
weeks onwards. (The points are statistically independent, and 90% confidence
intervals are indicated.) (from Peto et al., 1975)

Another way of testing whether the observed failure times comply with the Weibull
distribution is based on the fact that for the Weibull distribution $\log \log[1/S(t)] =
\log \beta + \kappa \log(t - w)$. Note that the left-hand side can also be written as $\log[-\log S(t)]$.
Using a nonparametric estimate of the survivor distribution $S(t)$ (Kaplan-Meier
estimate discussed in Section 5.3) and plotting its above transform against the
logarithm of time provides a simple check of the model.
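
This graphical check can be sketched in a few lines: form the Kaplan-Meier estimate, transform it, and fit a straight line whose slope estimates the Weibull shape. The sketch below is a bare-bones illustration with hypothetical names and data; it assumes the minimal time w is supplied, takes the logarithm of time measured from w, and uses an ordinary least-squares line that ignores the differing precision of the plotted points.

```python
import math

def kaplan_meier(times, events):
    """Return [(t, S(t))] at distinct event times; events[i] is 1 for a tumour, 0 for censoring."""
    data = sorted(zip(times, events))
    at_risk, surv, out, i = len(data), 1.0, [], 0
    while i < len(data):
        t = data[i][0]
        tied = [e for (u, e) in data if u == t]
        n_events = sum(tied)
        if n_events:
            surv *= 1.0 - n_events / at_risk
            out.append((t, surv))
        at_risk -= len(tied)
        i += len(tied)
    return out

def weibull_plot_points(km, w):
    """Points (log(t - w), log(-log S(t))), keeping only 0 < S(t) < 1 and t > w."""
    return [(math.log(t - w), math.log(-math.log(s)))
            for t, s in km if 0.0 < s < 1.0 and t > w]

def least_squares_line(points):
    """Slope and intercept of the ordinary least-squares line through the points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical times (weeks) with tumour indicators; w = 10 is assumed known.
times = [30, 42, 42, 55, 61, 70, 83, 90]
events = [1, 1, 0, 1, 1, 0, 1, 1]
slope, intercept = least_squares_line(weibull_plot_points(kaplan_meier(times, events), 10.0))
```

Under the Weibull model the slope estimates the shape parameter and the intercept estimates the logarithm of the scale parameter.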

Lee and O’Neill (1971) analysed an experiment in which BP was painted
continuously at 6, 12, 24 and 48 μg per week on four groups of 300 mice. They found that
not only could skin tumour incidence be well described by a Weibull distribution with $\kappa$
and $w$ common to all four groups, but that $\beta$ was proportional to dose squared. As can
be seen in Figure 6.12, the plots of $\log\log[1/S(t)]$ against $\log(t - 17.70)$ form
approximately parallel equidistant straight lines. The slope of the lines, 2.95, estimates
$\kappa$ and is not significantly different from an integer value as suggested by the model. The
average vertical difference between the lines is almost exactly twice the logarithm of
the ratio of successive doses implying $c = 2$ and thus the results are consistent with a
multistage hypothesis in which BP affects two out of three of the stages of the process.

Fig. 6.12 Fit of Weibull distribution to data from a skin-painting experiment with
benzo[a]pyrene in mice (from Lee & O'Neill, 1971)



Lee et al. (1977) describe the analysis of a series of ten mouse skin painting
experiments in which there were a total of 55 treatment groups consisting of either
whole cigarette smoke condensate (SWS) or various fractions of it tested at varying
dose levels. Common values of $\kappa = 3.05$ and $w = 11.29$ were fitted to the skin tumour
data by maximum likelihood methods and the following linear model for the remaining
Weibull parameter

$$\log \beta_{ij} = \mu + \alpha_i + \eta\,(\log \mathrm{dose}_j)$$

was fitted to the responses for the $i$th treatment (fraction) and $j$th dose level. This
approach leads to a simple description of the results in terms of the ‘tumorigenic ratio’
which measures the activity of the fraction relative to whole-smoke condensate on a
weight-for-weight basis.

Druckrey (1967) reported quantitative dose-response relationships incorporating 
time to response information for a variety of chemical carcinogens and established the 
now well-known relationship 

$$d \cdot t^{n} = \text{const.},$$

where $d$ is the daily dose and $t$ the median ‘tumour induction time’. This empirical
relationship can be seen as a corollary of the Weibull model (6.31) (Carlborg, 1981).
Consider $w = 0$ and the parameter $\beta$ being proportional to some power of the daily
dose $d$, that is, $\beta = \alpha \cdot d^{m}$. The absence of a constant, not dose-dependent term in this
submodel for $\beta$ implies a zero background response. Solving (6.31) for the median
induction time gives

$$0.5 = \exp(-\alpha d^{m} t^{k})$$

or

$$d\,t^{n} = \{(\log 2)/\alpha\}^{1/m} = \text{const.}$$

with $n = k/m$.
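The derivation can be checked numerically. This is a small sketch, assuming a Weibull model with $w = 0$ and $\beta = \alpha d^{m}$; the parameter values are hypothetical and chosen only for illustration. It solves for the median induction time at several doses and confirms that $d \cdot t^{n}$ is the same constant for each.

```python
import math

ALPHA, M, K = 1e-6, 2.0, 3.0   # hypothetical illustrative values
N_EXPONENT = K / M             # n = k/m


def median_induction_time(dose, alpha=ALPHA, m=M, k=K):
    """Solve 0.5 = exp(-alpha * dose**m * t**k) for t (case w = 0)."""
    return (math.log(2) / (alpha * dose ** m)) ** (1.0 / k)


doses = (1.0, 2.0, 5.0, 10.0)
products = [d * median_induction_time(d) ** N_EXPONENT for d in doses]
expected = (math.log(2) / ALPHA) ** (1.0 / M)
print(products, expected)
```

Every entry of `products` agrees with `expected` up to rounding error, so the median induction time falls with dose exactly as the relationship $d \cdot t^{n} = \text{const.}$ requires.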


Noncontinuous exposure 


Although the examples considered above concerned only the analysis of experiments 
in which skin tumours were produced in mice by regular skin painting, a Weibull 
distribution has also been successfully fitted to experiments with rats in which a single 
intrapleural inoculation of asbestos resulted in mesotheliomas of the pleura (Berry & 
Wagner, 1969; Wagner et al., 1973). There are two theoretical reasons why a Weibull 
distribution might fit in this situation. One is that, although the injection of asbestos is 
given as a single dose, the asbestos is not easily destructible and remains in the animal 
for a considerable time after injection, thus simulating continuous exposure. The 
second is that, in the multistage model, if the effect of a single exposure is so large that 
a substantial proportion of cells at risk are transformed very rapidly through the first 
$k^*$ stages, with a subsequent tumour occurring only after background transformations
cause the remaining $k - k^*$ transformations, the incidence rate will still obey a Weibull
distribution but with a parameter $k - k^*$ and not $k$.

In other experiments, such as those described by Day and Brown (1980), animals 
have been exposed for varying lengths of time and then treatment has been stopped. It 
is clear that in many of these experiments a simple Weibull distribution does not fit the 
observed response. This is not surprising, as the fourth assumption of the multistage 
hypothesis will not hold since kinetic rate constants of the stages affected by treatment 
will presumably change on stopping treatment. A number of workers have considered 
the mathematical implication of the application of multistage models to ‘stopping 
experiments’ (Lee, 1975; Whittemore & Keller, 1978; Day & Brown, 1980; Parish, 
1981). For example, in a three-stage model in which treatment causing kinetic rate
constants $\alpha_1$, $\alpha_2$ and $\alpha_3$ was applied up to time $S$ and then stopped, causing reversion
to background kinetic rate constants $\delta_1$, $\delta_2$ and $\delta_3$, the incidence rate at time
$T$ ($> S + w$), in the simplified situation where the waiting time $w$ all occurs after the
final transformation, is given by

$$N\left[\tfrac{1}{2}\alpha_1\alpha_2\delta_3 S^2 + \alpha_1\delta_2\delta_3\, S(T - S - w) + \tfrac{1}{2}\delta_1\delta_2\delta_3\,(T - S - w)^2\right];$$




the above references give details of formulae in more general situations ($k$ stages
rather than three; individual waiting times for each stage). It should be noted that the
shape of the incidence curve with time after stopping depends on which stages it is
assumed that the carcinogen affects. Thus, if only the first stage is affected
($\alpha_1 > \delta_1$, $\alpha_2 = \delta_2$, $\alpha_3 = \delta_3$), the fall off in incidence compared with continuous
exposure will be much less pronounced than if later stages are affected. In particular,
incidence will tend to be approximately constant for some time after stopping if the 
penultimate stage only is affected, and will tend to drop sharply to background levels 
after stopping if the final stage is affected. Lee (1975) has used maximum likelihood 
methods in an attempt to distinguish formally between hypotheses in which a 
carcinogen does and does not affect a certain stage or stages. However, the 
computation involved is considerable.
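The qualitative behaviour described above can be illustrated with a short sketch. It assumes the usual small-rate approximation for a three-stage process, in which the incidence rate after stopping is $N \delta_3$ times the expected number of cells that have completed two transformations, with rates $\alpha_j$ during treatment and $\delta_j$ afterwards. The function name and all numerical values are hypothetical.

```python
def incidence_after_stopping(T, S, w, alpha, delta, N=1.0):
    """Approximate three-stage incidence rate at time T > S + w.

    alpha: rates (a1, a2, a3) while treatment is applied (up to time S).
    delta: background rates (d1, d2, d3) after treatment stops.
    """
    a1, a2, _ = alpha          # a3 acts only while treatment continues
    d1, d2, d3 = delta
    u = T - S - w              # time since stopping, net of waiting time
    if u <= 0:
        raise ValueError("formula applies only for T > S + w")
    return N * d3 * (a1 * a2 * S ** 2 / 2 + a1 * d2 * S * u + d1 * d2 * u ** 2 / 2)


BACKGROUND = (1.0, 1.0, 1.0)   # hypothetical background rates (arbitrary units)
S, W = 50.0, 0.0               # treatment stopped at S; no waiting time
scenarios = {
    "first stage only": (10.0, 1.0, 1.0),
    "penultimate stage only": (1.0, 10.0, 1.0),
    "final stage only": (1.0, 1.0, 10.0),
}
growth = {}
for name, alpha in scenarios.items():
    early = incidence_after_stopping(S + 1, S, W, alpha, BACKGROUND)
    late = incidence_after_stopping(2 * S, S, W, alpha, BACKGROUND)
    growth[name] = late / early
    print(f"{name}: just after stopping {early:.1f}, at T = 2S {late:.1f}")
```

With these illustrative numbers, incidence keeps rising when only the first stage is affected, changes comparatively little when only the penultimate stage is affected, and equals the background curve when only the final stage is affected.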

Further discussion of details of analysis of these special experimental situations is 
outside the scope of this monograph, although it is worth pointing out that the methods 
for stopping experiments can also be applied to crossover experiments in which varying 
treatments are given in varying orders to the same animals. 


Fitting Weibull distributions to data for internal tumours 

In their analysis of data from the British Industrial Biological Research Association 
(BIBRA) nitrosamine study, some of which are described in Section 4.3, Peto et al. 
(1984) successfully used related Weibull distributions to describe the distribution of 
time X to onset of tumour and of time Y to death because of tumour. Referring back 
to the general formulation given above, they assumed $\lambda_X(t) = \beta t^{\kappa}$ and $\lambda_Y(t) = \beta f t^{\kappa}$.
The additional parameter $f$ they referred to as the ‘fatality factor’, ranging from 0 for a
completely nonfatal tumour to 1 for an instantly fatal tumour.

Over the wide range of dose levels tested, $f$ (and $\kappa$) appeared to be essentially
invariant of dose. This allowed characterization of the dose-response relationship for
liver and oesophageal tumours in terms of a single parameter $\beta$ for each tumour type.
While more experience is needed with this model, it appears to be a very useful 
approach. 

Kodell and Nelson (1980) use the Weibull distribution within their simplified
observational model for the carcinogenic process, which was introduced above (Fig.
6.9). They consider the transition time from N to T, which corresponds to the random
variable $X$ for time to onset of tumour, to follow a Weibull distribution. Their
parameterization of the hazard function is $\beta_1 t^{\gamma_1}$. They further consider the transition
time from T to $D_T$, which corresponds to the random variable $Y - X$, being of a
Weibull type with hazard function $\beta_2 t^{\gamma_2}$, as well as transition from N or T to $D_{NT}$ with
hazard function $\beta_3 t^{\gamma_3}$. Within this framework, the likelihood function is developed
considering both natural deaths and scheduled sacrifices. The likelihood function
depends on the six parameters $\beta_1$, $\beta_2$, $\beta_3$, $\gamma_1$, $\gamma_2$ and $\gamma_3$ and can be maximized
numerically.

Tolley et al. (1978) also chose a Weibull function to describe transition to the tumour
state, but chose a Gompertz function for transition to death from other causes. More
generally, Kalbfleisch et al. (1983) discuss likelihood estimation for an arbitrary
parametric model without necessarily making the assumption of independent competing
risks. In principle, however, the general formulation outlined earlier in this section,
which does not require estimation of the distribution of time to death from causes 
other than tumour, seems simpler. 


Limitations of multistage models and other modelling approaches 

In an analysis of data from a mouse-skin painting experiment, it was assumed that 
the incidence rate followed a log-normal distribution with time (Day, 1967). Subsequent
analysis by Peto et al. (1972) showed that Weibull distributions with a ($k$, $w$)-pair
common to all groups provided a significantly better fit to the data than did the
log-normal. 

Parish (1981) felt that it was unreasonable to expect animals to have an identical 
susceptibility to the effects of applied carcinogens, and suggested a model in which the 
parameter $\beta$ had a gamma distribution. To gain an impression of the likely variation in
susceptibility, she analysed data from the ageing experiment of Peto et al. (1975)
referred to previously, looking at time to appearance of further tumours in animals
according to how many tumours they already had. She concluded that the data were
consistent with a 50-fold variation in susceptibility between the 5th and 95th percentile
of the distribution. But even so, this variation was not large enough to make the
distribution of time to tumour differ materially from a Weibull distribution, except
where incidence rates were extremely high. The effect of susceptibility is to make the
plot of log incidence against $\log(t - w)$ fall away from a straight line at high $t$ values,
and this may be why, in Figure 6.12, there is a discernible, slight drop-off with the
48 µg/week dose for the last four points plotted.

It has also been noted that in some circumstances the dose-response relationship is 
not of the form predicted by the multistage model. Davies et al. (1974), who tested 
response to seven dose levels of smoke condensate in a mouse-skin painting 
experiment, noted that there was a clear flattening off in response above doses of 
180 mg/week. They suggested that high-dose levels were killing off a proportion of the 
cells at risk due to toxic effects, thus violating the first assumption of the multistage 
model that the number of cells at risk for each animal is the same for each group. In 
certain circumstances, it may be useful to modify the multistage model to allow for this 
possibility. Hulse et al. (1968) showed that the observed incidence of epidermal and 
dermal tumours in mice following superficial external $\beta$-irradiation may be accounted
for by assuming that tumour incidence is proportional to the square of the dose and
that potential tumour cells lose their reproductive integrity according to an exponentially
decreasing relationship with dose. For example, the dose-dependent part of the
hazard function may be of the form

$$(\beta_0 + \beta_1 d + \beta_2 d^2)\exp(-\alpha_1 d - \alpha_2 d^2).$$
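A brief sketch shows how a cell-killing factor of this kind produces a response that rises and then flattens or falls with dose. The coefficients below are hypothetical and are not estimates from any of the experiments cited. With purely quadratic induction and a single linear killing term $\alpha_1$, the dose term $d^2 \exp(-\alpha_1 d)$ is largest at $d = 2/\alpha_1$.

```python
import math


def dose_term(d, beta, alpha):
    """(b0 + b1*d + b2*d**2) * exp(-a1*d - a2*d**2)."""
    b0, b1, b2 = beta
    a1, a2 = alpha
    return (b0 + b1 * d + b2 * d * d) * math.exp(-a1 * d - a2 * d * d)


BETA = (0.0, 0.0, 1.0)    # hypothetical: induction proportional to dose squared
ALPHA = (0.01, 0.0)       # hypothetical: cell survival falls as exp(-0.01 * d)

doses = range(0, 601)
peak_dose = max(doses, key=lambda d: dose_term(d, BETA, ALPHA))
print(peak_dose, 2 / ALPHA[0])
```

Beyond the peak dose the killing term dominates, so further increases in dose no longer increase the modelled hazard.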

Whittemore (1978) has reviewed a number of quantitative theories of carcinogenesis. 
She presented clear evidence of the inadequacy of theories not dependent on a 
multistage process, such as the single-stage theory of Iverson and Arley (1950) and the 
multicell theory of Fisher and Hollomon (1951), and considered a number of 
alternative versions of the multistage theory. She concluded that, although the
multistage theory has a number of limitations (failure to distinguish between benign
and malignant tumours, to consider the possibility of cell repair or the action of the 
host’s immune system, to consider the differences in susceptibility or to consider that 
the sensitivity of target cells to transformation may not be constant), it nevertheless 
provides a flexible, broad and biologically plausible framework in which to examine the 
gross behaviour of tumour data. 

Moolgavkar and Knudson (1981) propose a two-stage model which incorporates the 
growth and differentiation of normal target cells and intermediate cells (that is, cells in 
which the first stage has occurred). They demonstrate that experimental animal data 
and human epidemiological data are consistent with their two-stage model, noting that 
previous inferences that there are more than two stages in the development of cancer 
can be explained by differences in the growth kinetics of intermediate cells. By 
incorporating differentiation into their model, the authors are able to explain the 
age-incidence curves for some cancers (for example, certain childhood cancers), which 
cannot be explained easily in terms of a simple multistage model. 

In an attempt to use the full information from an animal experiment (including time 
to death) for the estimation of ‘safe doses’, Hartley and Sielken (1977) model the
hazard function as a product of a dose-dependent and a time-dependent term

$$\lambda(t, d) = g(d) \cdot h(t),$$

where the time-dependent term is chosen to be 

$$h(t) = \sum_{r=1} \cdots$$


Proportional hazards models 

For the analysis of rapidly lethal or observable tumours, we showed in Section 5.6 
that the appropriate methods correspond to those given in Section 5.3 for survival 
analysis. The only difference is that one uses death due to tumour, or the appearance 
of an observable tumour, as the experimental endpoint, rather than death from any 
cause. It is useful to adopt the terminology ‘failure’ to denote such well-defined events 
as appearance of an observable tumour or death, and ‘failure time’ to denote the time 
to occurrence of such an event. Methods have been developed for the analysis of 
censored failure times which require no distributional assumptions (such as Weibull 
distributed time-to-tumour). The most widely used method is the proportional hazards 
model (Cox, 1972). 

Under the proportional hazards model, the hazard function, that is, the age-specific 
failure rate for an animal with covariates $z = (z_1, \ldots, z_p)'$, is

$$\lambda(t; z) = \lambda_0(t) \cdot \exp(\theta' z)$$

where $\lambda_0(t)$ is a completely unspecified hazard function, and $\theta' = (\theta_1, \ldots, \theta_p)$ a vector
of regression parameters, and

$$\theta' z = \sum_{u=1}^{p} \theta_u z_u.$$

For example, for a given animal, $z_1$ could be the administered dose level of a test
compound, $z_2$ could be the initial body weight, $z_3$ could describe the row location and
$z_4$ the column location of the animal’s cage. Then the magnitude of the association
between dose level of the compound and the failure, with adjustment for the remaining
variables, can be measured by the estimate of the parameter corresponding to $z_1$ in
the proportional hazards model.

As in Section 5.3, suppose that failures are observed at $K$ distinct times $t_k$,
$k = 1, \ldots, K$. Let $x_k$ denote the number of animals failing at $t_k$, and let $s_k$ denote the sum
of the covariate vectors $z_{ik}$ corresponding to the animals failing at $t_k$, $i = 1, \ldots, x_k$.
Then, if the number of ties (that is, $x_k > 1$) at each $t_k$ is small, the parameter vector $\theta$
can be estimated by maximizing the approximate likelihood (Breslow, 1974):

$$L = \prod_{k=1}^{K} \frac{\exp(\theta' s_k)}{\left[\sum_{i \in R_k} \exp(\theta' z_i)\right]^{x_k}},$$

where $R_k$ denotes the set of indices corresponding to animals which survived to time $t_k$,
and thus were at risk of failing at $t_k$. The approximate likelihood can be maximized
using the Newton-Raphson method, and the covariance matrix for the resulting
estimator $\hat\theta$ can be estimated by the negative of the inverse of the matrix of second
partial derivatives of $\log(L)$ (Kalbfleisch & Prentice, 1980, Chapter 4; Miller, 1981b,
Chapter 6). To illustrate analyses based on the proportional hazards model, consider
the cigar-smoke condensate data (Section 4.2). The data can be examined for evidence
of a dose-related increase in tumour rates by fitting the model $\lambda(t; z) = \exp(\theta z)\lambda_0(t)$,
where $z = 0$ for animals in the low-dose group, $z = 19$ for animals in the middle-dose
group and $z = 44$ for animals in the high-dose group. As noted above, the hazard
function for the low-dose group, $\lambda_0(t)$, need not be specified. Maximizing the
likelihood $L$ leads to the estimate $\hat\theta = 0.0261 \pm 0.0060$, indicating that cancer risk
increases with increasing dose.
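The estimation procedure can be sketched for the single-covariate case. The code below is an illustrative implementation of Newton-Raphson maximization of the Breslow approximate likelihood, with the standard error taken from the inverse of the negative second derivative. The small data set is hypothetical (it only borrows the dose scores 0, 19 and 44) and is not the cigar-smoke condensate data; the function names are assumptions of this sketch.

```python
import math


def breslow_score_info(theta, times, events, z):
    """Score and information (negative second derivative) of log L at theta."""
    score, info = 0.0, 0.0
    failure_times = sorted({t for t, e in zip(times, events) if e})
    for t_k in failure_times:
        failing = [zi for t, e, zi in zip(times, events, z) if e and t == t_k]
        at_risk = [zi for t, zi in zip(times, z) if t >= t_k]  # risk set R_k
        weights = [math.exp(theta * zi) for zi in at_risk]
        s0 = sum(weights)
        s1 = sum(wt * zi for wt, zi in zip(weights, at_risk))
        s2 = sum(wt * zi * zi for wt, zi in zip(weights, at_risk))
        x_k = len(failing)
        score += sum(failing) - x_k * s1 / s0
        info += x_k * (s2 / s0 - (s1 / s0) ** 2)
    return score, info


def fit_breslow(times, events, z, tol=1e-10, max_iter=50):
    """Newton-Raphson estimate of theta and its approximate standard error."""
    theta = 0.0
    for _ in range(max_iter):
        score, info = breslow_score_info(theta, times, events, z)
        step = score / info
        theta += step
        if abs(step) < tol:
            break
    _, info = breslow_score_info(theta, times, events, z)
    return theta, math.sqrt(1.0 / info)


# Hypothetical data: failure or censoring time, event indicator, dose score.
times = [18, 24, 24, 12, 20, 24, 8, 15, 22]
events = [1, 0, 0, 1, 1, 0, 1, 1, 1]
z = [0, 0, 0, 19, 19, 19, 44, 44, 44]

theta_hat, se = fit_breslow(times, events, z)
print(theta_hat, se)
```

A positive `theta_hat` indicates a failure rate that increases with the dose score, and `theta_hat` divided by `se` gives an approximate test of the hypothesis of no dose effect.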

Pairwise comparison of the middle-dose group to the low-dose group and of the 
high-dose group to the low-dose group can be accomplished by fitting a model which 
includes two covariates: $z_1 = 1$ if the animal is from the middle-dose group and $z_1 = 0$
otherwise, $z_2 = 1$ if the animal is from the high-dose group and $z_2 = 0$ otherwise. The
model

$$\lambda(t; z) = \exp(\theta_1 z_1 + \theta_2 z_2)\lambda_0(t)$$

provides estimates $\hat\theta_1$ and $\hat\theta_2$ such that $\exp(\hat\theta_1)$ and $\exp(\hat\theta_2)$ are estimates of the relative
risk of the middle-dose group compared to the low-dose group, and of the high-dose 
group compared to the low-dose group, respectively. The results are summarized in 
Table 6.4. 


Table 6.4 Estimates of relative risks using the proportional
hazards model, for data on cigar-smoke condensate

As in Section 5.6, there is good agreement between the estimates $\exp(\hat\theta_1 z_1 + \hat\theta_2 z_2)$
and the estimates based on the regression parameter $\theta$.
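The relative risks implied by the single regression parameter can be computed directly. This short sketch uses the estimate 0.0261 and the dose scores 19 and 44 quoted in the text; the function and variable names are illustrative.

```python
import math

THETA_HAT = 0.0261                     # estimate quoted in the text
DOSE_SCORES = {"middle": 19, "high": 44}


def relative_risk(theta, z):
    """Hazard ratio exp(theta * z) relative to an animal with z = 0."""
    return math.exp(theta * z)


relative_risks = {g: relative_risk(THETA_HAT, z) for g, z in DOSE_SCORES.items()}
for group, rr in relative_risks.items():
    print(f"{group}-dose vs low-dose: {rr:.2f}")
```

This gives relative risks of roughly 1.64 for the middle-dose group and 3.15 for the high-dose group, each relative to the low-dose group.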

If there are no ties (that is, $x_k = 1$ for all $k$), $L$ can be derived as the marginal
likelihood based on the distribution of ranks, but in the presence of ties, the expression 
for the marginal likelihood is more complicated than L (Kalbfleisch & Prentice, 1980, 
Chapter 4). Estimates of survival curves based on the proportional hazards model are 
available (Kalbfleisch & Prentice, 1980, pp. 84-87; Miller, 1981b, pp. 133-136), and 
estimates of percentiles can be obtained from the estimated survival curves. 
Confidence intervals for percentiles can be calculated by analogy to the methods based 
on the Kaplan-Meier survival curves (Slud et al., 1984), with substitution of the 
variance expression corresponding to the proportional hazards survival curves
(Kalbfleisch & Prentice, 1980, pp. 116-117).


Regression models for tumour prevalence 


The logistic regression model of Dinse and Lagakos (1983) for comparing treatment 
groups with respect to tumour prevalence was discussed in Section 5.5. Their 
regression model can be used in a more general setting when there are other covariates 
in addition to specific group membership or dose level. This more general regression 
context allows incorporation of several covariates and is computationally simple to 
analyse. 

Formally, Dinse and Lagakos use the regression model 


$$V(x, z, t) = \frac{\mathrm{pr}(Y = 1 \mid X = x, Z = z, T = t)}{\mathrm{pr}(Y = 0 \mid X = x, Z = z, T = t)} = \exp\{\alpha x w(t) + \theta' z + \gamma(t)\}$$

to model the odds of having ($Y = 1$) or not having ($Y = 0$) the tumour at death,
with $X$ a binary treatment indicator, $z$ a $p$-vector of covariates and $T$ survival time. The
scalar $\alpha$ and the $p$-vector $\theta$ are unknown parameters, with $\gamma(t)$ and $w(t)$ prespecified
functions of time. If treatment is assumed to have a constant effect on log-odds, $w(t)$ is
set equal to 1 for all $t$. By setting $w(t) = 1 + f(t)\delta/\alpha$, where $f(t)$ is some function of
time, such as $t$ or $\log t$, one replaces $\alpha x w(t)$ by $\alpha x + \delta x f(t)$, and this allows a test of the
hypothesis $\delta = 0$, that is, whether the proportional treatment odds relationship
depends on time.


Integrated models for tumour prevalence and lethality

Another method, which differs from those described above, in that it takes into 
account the possibility of simultaneous study of more than one type of tumour, but is 
applicable only to experiments in which there are a number of scheduled sacrifices, has 
been described by Turnbull and Mitchell (1978) and by Mitchell and Turnbull (1979). 
For the purposes of their method, animals in one of $R$ treatment groups ($r = 1, \ldots, R$)
dying in $M$ time intervals ($m = 1, \ldots, M$) are classified as being in one of $K$
($k = 1, \ldots, K$) ‘illness states’. These illness states are defined in terms of whether an animal
has or does not have particular tumour types. Thus, dealing with three tumours of
interest, there are $2^3 = 8$ illness states.

Their statistical model is defined in terms of prevalence $p_{kmr}$, the probability that an
animal from group $r$, alive at the beginning of the $m$th interval, is in illness state $k$ at
that time, and lethality $q_{kmr}$, the conditional probability that an animal from group $r$
dies during the $m$th interval, given that it was alive and in illness state $k$ at the
beginning of this interval. Data consist of $w_{kmr}$, the number of animals in group $r$
sacrificed (withdrawn) in interval $m$ and found in illness state $k$; $d_{kmr}$, the number of
animals from group $r$ dying in interval $m$ and diagnosed with illness state $k$; and $s_{mr}$,
the number of animals from group $r$ surviving through interval $m$. The distribution of
such surviving animals among the illness states is not known, but can be characterized
by $s_{kmr}$, as these unobserved counts must be taken into account when fitting statistical
models for the prevalence and for the lethality.

The numerator of the prevalence of an illness state is represented by $w_{kmr} + d_{kmr} + s_{kmr}$,
the term $s_{kmr}$ being estimated iteratively. Such values comprise data to which a
log-linear model, formulated in terms of dependency of $p_{kmr}$ on explanatory factors,
such as treatment, time and the presence of tumours of each type, can be fitted.

Similarly, a statistical model is fitted for lethality $q_{kmr}$. Since the lethality is a
conditional probability, both numerator, $d_{kmr}$, and denominator, $d_{kmr} + s_{kmr}$, have to
be specified. The model used is a logistic one in which the dependency of $q_{kmr}$ on a
similar set of explanatory factors is studied.

The crucial problem of not knowing the $s_{kmr}$ is dealt with as follows. Firstly, the
distribution of the $s_{mr}$ survivors to the illness states is assumed to be as in the
distribution observed in the animals which died or were killed, so that

$$s_{kmr} = s_{mr}\,(d_{kmr} + w_{kmr}) \Big/ \sum_{k=1}^{K} (d_{kmr} + w_{kmr}).$$

Using these $s_{kmr}$, prevalence and lethality models are then fitted, which give rise to
estimates $\hat p_{kmr}$ and $\hat q_{kmr}$. These estimates are then used to re-assess the distribution of
the $s_{mr}$ survivors to the illness states by the formula

$$s'_{kmr} = s_{mr}\,\hat p_{kmr}(1 - \hat q_{kmr})/(1 - \hat h_{mr}),$$

where $\hat h_{mr} = \sum_{k=1}^{K} \hat p_{kmr}\hat q_{kmr}$ is the unconditional probability of dying in group $r$ in
interval $m$. The $s_{kmr}$ can then be replaced by $s'_{kmr}$ and the same models fitted to the slightly
modified data. This process can be iterated until the distribution of the survivors
into the illness states no longer changes.
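The two allocation steps for the survivors can be sketched as follows for a single treatment group and time interval. The counts and the fitted prevalence and lethality values are hypothetical; in an actual analysis the latter would come from the fitted log-linear and logistic models, which are not implemented here.

```python
def initial_allocation(s_mr, d, w):
    """Split the s_mr survivors across states in proportion to d_k + w_k."""
    totals = [dk + wk for dk, wk in zip(d, w)]
    grand_total = sum(totals)
    return [s_mr * t / grand_total for t in totals]


def reassess_allocation(s_mr, p_hat, q_hat):
    """Re-allocate survivors: s_mr * p_k * (1 - q_k) / (1 - h), h = sum p_k q_k."""
    h = sum(p * q for p, q in zip(p_hat, q_hat))
    return [s_mr * p * (1 - q) / (1 - h) for p, q in zip(p_hat, q_hat)]


# Hypothetical data for K = 2 illness states (without and with tumour).
d = [2, 6]          # deaths found in each state
w = [8, 4]          # sacrificed animals found in each state
s_mr = 30           # survivors of the interval, state unknown

s_initial = initial_allocation(s_mr, d, w)

# Hypothetical fitted prevalences and lethalities from the two models.
p_hat = [0.6, 0.4]
q_hat = [0.05, 0.30]
s_updated = reassess_allocation(s_mr, p_hat, q_hat)
print(s_initial, s_updated)
```

Because the prevalences sum to one over the illness states, each allocation distributes exactly the $s_{mr}$ survivors; iterating between model fitting and re-allocation continues until the allocation stabilizes.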

It should be noted that the analysis makes the (technical) assumption that illness 
state changes are made only at the beginning of each of the M time intervals and that 
sacrifices occur only immediately after the illness state changes, in order for the
prevalences and the lethalities to be defined. It is thus convenient to choose partitions 
of the time axis such that each interval covers one scheduled sacrifice. 

A detailed application of this approach is given by Wahrendorf (1983). A very 
interesting feature of this method is that it allows simultaneous analysis of different 
tumour types. This opens interesting possibilities for assessing associations between 
different tumours. Berlin et al. (1979) also consider a general Markov model for
multiple tumour types and discuss the question of identifiability.

6.4 Summary

The number of animals developing tumours during the course of a conventional 
two-to-three-year rodent carcinogen bioassay will depend on the dose level to which 
they are exposed. The overall shape of the dose-response curve can, however, vary 
widely, depending on the particular agent being evaluated. Although most dose- 
response relationships generally increase with dose, this increase may be either linear 
or distinctly nonlinear. In the latter case, dose-response curves which increase rapidly 
beyond a certain dose range may be noted, as with nasal tumours induced by exposure 
to formaldehyde. Conversely, a levelling off of the rate of response may be observed at 
high doses, as with liver tumours resulting from exposure to vinyl chloride. Combina¬ 
tions of these different shapes are also possible, as with the S-shaped dose-response 
seen for aflatoxin-induced liver tumours. 

A variety of different mathematical dose-response models may be used to provide a 
parsimonious description of the observed dose-response relationship. Simple tolerance 
distribution models, although often sufficiently flexible to provide a good fit to
dose-response data, are somewhat naive in terms of their underlying biological basis as 
a possible mechanism for carcinogenesis. Stochastic models based on the notion that 
carcinogenesis results from the random occurrence of one or more fundamental 
biological events are more appealing, but are necessarily based on strong but uncertain 
assumptions. Foremost among such mechanistic models is the Armitage-Doll multi¬ 
stage model, based on the assumption that a cell progresses through a number of 
distinct stages before becoming cancerous, the transition intensity function for each 
stage being a linear function of dose. 

Since many compounds may require metabolic activation before being converted 
into their active form, consideration may be given to pharmacokinetic models for this 
process. This is particularly important when certain steps such as absorption, 
elimination, activation or detoxification are saturable. Even if the response rate is 
directly proportional to the dose level of the activated complex reaching the target 
tissue, such saturation effects may account for nonlinearity in the dose-response 
relationship when expressed as a function of the administered rather than of the 
delivered dose. 

Given a suitable model for the dose-response relationship, estimates of certain 
quantiles of the curve may be of interest. Because human exposure to most 
environmental carcinogens is low, there has been considerable interest in obtaining 
estimates of risk in the low-dose region based on the downward extrapolation of results 
obtained at higher doses. Unfortunately, this is subject to considerable model 
uncertainty, and different models, all equally consonant with the observed data, give 
widely different projections upon extrapolation to low doses. Because of this, a linear
extrapolation to low doses is often advocated as the most prudent approach. This will 
be particularly appropriate in cases where the background response rate can be 
considered to arise from an at least partially dose-wise additive model. 

This low-dose model dependency may be circumvented by restricting attention to 
those quantiles lying well within the observable response range. Historically, quantiles 
such as the TD$_{50}$ (the dose estimated to induce tumours in 50% of exposed animals)
have been used as the basis for various measures of carcinogenic potency. The TD$_{50}$
itself has recently been used by Gold et al. (1984) to demonstrate variations in 
carcinogenic potency spanning eight orders of magnitude. 

Time-to-tumour models attempt to describe the carcinogenic process in detail and 
make use of information on individual tumour occurrence and survival times. For 
observable tumours, parametric models addressing the time to first occurrence of this 
tumour have been developed on biological principles. The parameters of these models 
allow for inferences to be drawn on both the time and dose dependency of tumour 
incidence. For occult tumours the situation is more complicated: tumour incidence is 
not directly observable and its estimation depends on the experimental design, 
particularly the use of a series of interim sacrifices. However, within this framework, 
estimates for the relevant functions characterizing the carcinogenic response are 
derived, if identifiable. This provides the basis not only for unbiased statistical 
inference but also for clear biological conclusions. Time-to-tumour models make use of 
as much information as possible; they are in turn based on some assumptions, but they
provide the most thorough description of the observed response in a long-term animal 
experiment. 


LIST OF ESSENTIAL SYMBOLS - CHAPTER 6 (in order of appearance) 


$P(d)$            probability of treatment-induced tumour at dose $d$
$G(\cdot)$        tolerance distribution
$P^*(d)$          probability of spontaneous or treatment-induced tumour at dose $d$
…                 background response rate (probability of spontaneous tumour)
$D_1(t)$          administered dose at time $t$
$D_2(t)$          activated dose at time $t$
…                 effective dose under steady state conditions
$\theta$          vector of parameters $\theta_1, \ldots, \theta_p$
$P^*(d; \theta)$  probability of response (tumour) at dose $d$ dependent on parameter $\theta$
$d_i$             dose levels ($d_0 = 0$, $d_1 < \cdots < d_I$)
…                 number of animals with tumour at dose $d_i$
$n_i$             number of animals at dose $d_i$
$L$               likelihood of observed outcome under any dose-response model $P^*(d; \theta)$
$\hat\theta$      maximum likelihood estimate of $\theta$
$\mathbf{0}$      vector of zeros
$V$               variance-covariance matrix of $\sqrt{n}(\hat\theta - \theta)$
$v^{rr}$          ($r$, $r$)th element of the inverse of $V$
$\Pi(d)$          added risk over background at dose $d$
$d_q$             dose corresponding to the added risk, $q$, over background
…                 measure of carcinogenic potency based on $d_q$
N                 disease-free state of animal
T                 animal with tumour
$D_{NT}$          death from a cause unrelated to the tumour of interest
$D_T$             death due to tumour of interest


$X$               time to onset of tumour (random variable)
$Y$               time to death due to tumour (random variable)
$Z$               time to death from an unrelated cause (random variable)
$f(t)$            density function
$\lambda(t)$      hazard function
$\Lambda(t)$      cumulative hazard function
$S(t)$            survival function
(N.B. A subscript $X$ or $Y$ to the quantities $f(t)$, $\lambda(t)$, $\Lambda(t)$ and $S(t)$ indicates the
corresponding functions for the random variable $X$ or $Y$)
$t_k$             time of observation of an event such as death or the occurrence of
                  tumour ($k = 1, \ldots, K$)
$a_k$             number of animals with appearance of a visible tumour at time $t_k$
$b_k$             number of animals with death caused by tumour of interest (fatal
                  context) at time $t_k$
$c_k$             number of animals with death from unrelated cause, tumour of
                  interest present (incidental context), at time $t_k$
$d_k$             number of animals dying without tumour of interest at time $t_k$
$L_1$             likelihood in the case of visible tumours
$L_2$             likelihood in the case of occult tumours all observed in a fatal context
$L_3$             likelihood in the case of occult tumours all observed in an incidental
                  context
$L_4$             likelihood in the case of occult tumours observed in fatal or incidental
                  context
(N.B. $LL_i$ ($i = 1, 2, 3, 4$) denotes the logarithm of the likelihoods $L_i$ ($i = 1, 2, 3, 4$))
$k$               number of stages or transformations required before tumour occurs
…                 kinetic rate constants for the $j$th stage of multistage model
                  ($j = 1, \ldots, k$)
$w_j$             constant time taken by the $j$th transformation of the multistage model
                  ($j = 1, \ldots, k$)
$\beta_i$         Weibull shape parameter for group $i$ ($i = 1, \ldots, I$)
$LL$              log-likelihood for an experiment with $I$ groups under the Weibull
                  model with known $w$ and $k$
$z$               vector of covariates observed for each animal
$\lambda(t; z)$   hazard function of animal with covariates $z$
$\lambda_0(t)$    baseline hazard function




PM3000456666 


Source: https://www.industrydocuments.ucsf.edu/docs/sncl0001 



